{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**MongoDB Assignment (Movie Review Analysis with PyMongo) - Data Science Class 1, 2019 cohort - 严超超 - 19211870125**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "# Movie Review Analysis with MongoDB"
   ]
  },
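  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- **Introduction:**\n",
    "\n",
    "Assignment overview:\n",
    "\n",
    "The data is scraped from Douban with Python: the third-party `requests` and `bs4` libraries plus the standard-library `re` module fetch the information for the Top 10 movies and a sample of their reviews (around 100).\n",
    "\n",
    "The scraped data is preprocessed and packed into `dict` objects. The `pymongo` library then connects to the local MongoDB instance and inserts this real data into the database. MongoDB's `map_reduce` and `aggregate` mechanisms are used to aggregate, group, and summarize the data. With MongoDB as the store for both movie and review information, the assignment analyzes the overall value of each movie.\n"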
   ]
  },
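  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The flow described above (scraped rows packed into dicts, inserted through `pymongo`, then summarized with the aggregation pipeline) can be sketched roughly as follows. This is a minimal sketch, not the assignment's actual code: the database name `douban_demo`, the collection name `movies`, the connection URI and all helper names are assumptions made for illustration, and `run_demo` expects a MongoDB server listening on localhost.\n",
    "\n",
    "```python\n",
    "def to_document(row):\n",
    "    # Hypothetical helper: copy one scraped row and cast the rating to float,\n",
    "    # because $avg only averages numeric values and skips strings.\n",
    "    doc = dict(row)\n",
    "    doc['mv_star'] = float(doc['mv_star'])\n",
    "    return doc\n",
    "\n",
    "\n",
    "def average_star_pipeline():\n",
    "    # A single $group stage with _id None folds the whole collection into one result.\n",
    "    return [{'$group': {'_id': None, 'avg_star': {'$avg': '$mv_star'}}}]\n",
    "\n",
    "\n",
    "def run_demo(rows, uri='mongodb://localhost:27017'):\n",
    "    # pymongo is imported here so the helpers above stay usable without a server.\n",
    "    from pymongo import MongoClient\n",
    "\n",
    "    client = MongoClient(uri)\n",
    "    coll = client['douban_demo']['movies']  # assumed database and collection names\n",
    "    coll.insert_many([to_document(r) for r in rows])\n",
    "    return list(coll.aggregate(average_star_pipeline()))\n",
    "```\n",
    "\n",
    "Calling `run_demo` with a list of scraped rows against an empty collection should return a single document that carries an `avg_star` field."
   ]
  },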
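  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Contents\n",
    "\n",
    "**1. Scraping the Top 10 Movie Data from Douban**\n",
    "- 1.1 Scrape the Top 10 movie information\n",
    "- 1.2 Scrape the reviews for each movie\n",
    "- 1.3 Tidy the scraped data\n",
    "\n",
    "**2. Working with the Douban Review Dataset in MongoDB**\n",
    "- 2.1 Create a MongoDB connection\n",
    "- 2.2 Insert documents into a MongoDB collection\n",
    "- 2.3 Inspect the data inserted into MongoDB\n",
    "- 2.4 Apply the same steps to the reviews\n",
    "- 2.5 Insert the review data into MongoDB\n",
    "\n",
    "**3. MongoDB in Practice**\n",
    "- 3.1 Compute the average rating of the Douban Top 10 movies\n",
    "- 3.2 Compute the average `[agree/disagree]` ratio for the Top 10 movie reviews\n",
    "- 3.3 Summarize the 2022 reviews of the Top 10 movies"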
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "import re\n",
    "from bs4 import BeautifulSoup\n",
    "import pandas as pd\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1. Scraping the Top 10 Movie Data from Douban\n",
    "\n",
    "Source URL: https://movie.douban.com/top250\n",
    "\n",
    "---\n",
    "\n",
    "## 1.1 Scrape the Top 10 movie information"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[info] >> Douban Top 10 movie info saved to ./data/mv_info.xlsx\n"
     ]
    }
   ],
   "source": [
    "# 1. Imports\n",
    "import os\n",
    "import re\n",
    "\n",
    "import pandas as pd\n",
    "import requests\n",
    "from bs4 import BeautifulSoup\n",
    "\n",
    "# 2. Request settings: target URL plus a browser-like User-Agent header\n",
    "headers = {\n",
    "    'User-Agent':  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',\n",
    "}\n",
    "url = 'https://movie.douban.com/top250'\n",
    "\n",
    "# 3. Fetch the page with requests and fail early on an HTTP error status\n",
    "resp = requests.get(url, headers=headers, timeout=10)\n",
    "resp.raise_for_status()\n",
    "html = resp.text\n",
    "\n",
    "# 4. Parse the HTML with BeautifulSoup, naming the parser explicitly\n",
    "soup = BeautifulSoup(html, 'html.parser')\n",
    "\n",
    "# 5. Extract the fields with CSS selectors and regular expressions\n",
    "mv_data = []\n",
    "for i, x in enumerate(soup.select('.item'), start=1):\n",
    "    # Movie title (first .title span)\n",
    "    mv_name = re.search(\n",
    "        '>([^<]+)<', str(x.select('.info > .hd > a > .title'))).group(1)\n",
    "    # Movie page URL, read directly from the link's href attribute\n",
    "    mv_href = x.select_one('.info > .hd > a')['href']\n",
    "    # Detail paragraphs of the movie\n",
    "    mv_info = x.select('.info > .bd > p')\n",
    "    # Director and leading cast (text before the <br/>)\n",
    "    mv_actors = re.search('>([^<]+)<', str(mv_info)\n",
    "                          ).group(1).strip().replace('\\xa0', '')\n",
    "    # Release year, country, and genre (text after the <br/>)\n",
    "    mv_type = re.search('<br/>([^<]+)</p>', str(mv_info)).group(1).strip().replace('\\xa0', '')\n",
    "    # One-line blurb\n",
    "    mv_review = re.search(\n",
    "        '>([^<]+)<', str(x.select('.info > .bd > p > .inq'))).group(1)\n",
    "    # Rating\n",
    "    mv_star = re.search(\n",
    "        '>([^<]+)<', str(x.select('.info > .bd > .star > .rating_num'))).group(1)\n",
    "    # Number of ratings\n",
    "    mv_evaNum = re.search(\n",
    "        '([0-9]+)人评价', str(x.select('.info > .bd > .star'))).group(1)\n",
    "    mv_data.append({\n",
    "        # Numeric id: the path segment just before the trailing slash\n",
    "        'mv_id': mv_href.split('/')[-2],\n",
    "        'mv_rank': i,\n",
    "        'mv_name': mv_name,\n",
    "        'mv_href': mv_href,\n",
    "        'mv_actors': mv_actors,\n",
    "        'mv_type': mv_type,\n",
    "        'mv_review': mv_review,\n",
    "        'mv_star': mv_star,\n",
    "        'mv_evaNum': mv_evaNum,\n",
    "    })\n",
    "\n",
    "# Keep the first 10 entries and save them; create ./data if it is missing\n",
    "mv_data = pd.DataFrame(data=mv_data[:10])\n",
    "os.makedirs('./data', exist_ok=True)\n",
    "mv_data.to_excel('./data/mv_info.xlsx')\n",
    "print('[info] >> Douban Top 10 movie info saved to ./data/mv_info.xlsx')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mv_id</th>\n",
       "      <th>mv_rank</th>\n",
       "      <th>mv_name</th>\n",
       "      <th>mv_href</th>\n",
       "      <th>mv_actors</th>\n",
       "      <th>mv_type</th>\n",
       "      <th>mv_review</th>\n",
       "      <th>mv_star</th>\n",
       "      <th>mv_evaNum</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1292052</td>\n",
       "      <td>1</td>\n",
       "      <td>肖申克的救赎</td>\n",
       "      <td>https://movie.douban.com/subject/1292052/</td>\n",
       "      <td>导演: 弗兰克·德拉邦特 Frank Darabont主演: 蒂姆·罗宾斯 Tim Robb...</td>\n",
       "      <td>1994/美国/犯罪 剧情</td>\n",
       "      <td>希望让人自由。</td>\n",
       "      <td>9.7</td>\n",
       "      <td>2611849</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1291546</td>\n",
       "      <td>2</td>\n",
       "      <td>霸王别姬</td>\n",
       "      <td>https://movie.douban.com/subject/1291546/</td>\n",
       "      <td>导演: 陈凯歌 Kaige Chen主演: 张国荣 Leslie Cheung / 张丰毅 ...</td>\n",
       "      <td>1993/中国大陆 中国香港/剧情 爱情 同性</td>\n",
       "      <td>风华绝代。</td>\n",
       "      <td>9.6</td>\n",
       "      <td>1939306</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1292720</td>\n",
       "      <td>3</td>\n",
       "      <td>阿甘正传</td>\n",
       "      <td>https://movie.douban.com/subject/1292720/</td>\n",
       "      <td>导演: 罗伯特·泽米吉斯 Robert Zemeckis主演: 汤姆·汉克斯 Tom Han...</td>\n",
       "      <td>1994/美国/剧情 爱情</td>\n",
       "      <td>一部美国近现代史。</td>\n",
       "      <td>9.5</td>\n",
       "      <td>1963347</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1292722</td>\n",
       "      <td>4</td>\n",
       "      <td>泰坦尼克号</td>\n",
       "      <td>https://movie.douban.com/subject/1292722/</td>\n",
       "      <td>导演: 詹姆斯·卡梅隆 James Cameron主演: 莱昂纳多·迪卡普里奥 Leonar...</td>\n",
       "      <td>1997/美国 墨西哥 澳大利亚 加拿大/剧情 爱情 灾难</td>\n",
       "      <td>失去的才是永恒的。</td>\n",
       "      <td>9.4</td>\n",
       "      <td>1923794</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1295644</td>\n",
       "      <td>5</td>\n",
       "      <td>这个杀手不太冷</td>\n",
       "      <td>https://movie.douban.com/subject/1295644/</td>\n",
       "      <td>导演: 吕克·贝松 Luc Besson主演: 让·雷诺 Jean Reno / 娜塔莉·波...</td>\n",
       "      <td>1994/法国 美国/剧情 动作 犯罪</td>\n",
       "      <td>怪蜀黍和小萝莉不得不说的故事。</td>\n",
       "      <td>9.4</td>\n",
       "      <td>2115656</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>1292063</td>\n",
       "      <td>6</td>\n",
       "      <td>美丽人生</td>\n",
       "      <td>https://movie.douban.com/subject/1292063/</td>\n",
       "      <td>导演: 罗伯托·贝尼尼 Roberto Benigni主演: 罗伯托·贝尼尼 Roberto...</td>\n",
       "      <td>1997/意大利/剧情 喜剧 爱情 战争</td>\n",
       "      <td>最美的谎言。</td>\n",
       "      <td>9.6</td>\n",
       "      <td>1206493</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>1291561</td>\n",
       "      <td>7</td>\n",
       "      <td>千与千寻</td>\n",
       "      <td>https://movie.douban.com/subject/1291561/</td>\n",
       "      <td>导演: 宫崎骏 Hayao Miyazaki主演: 柊瑠美 Rumi Hîragi / 入野...</td>\n",
       "      <td>2001/日本/剧情 动画 奇幻</td>\n",
       "      <td>最好的宫崎骏，最好的久石让。</td>\n",
       "      <td>9.4</td>\n",
       "      <td>2040396</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>1295124</td>\n",
       "      <td>8</td>\n",
       "      <td>辛德勒的名单</td>\n",
       "      <td>https://movie.douban.com/subject/1295124/</td>\n",
       "      <td>导演: 史蒂文·斯皮尔伯格 Steven Spielberg主演: 连姆·尼森 Liam N...</td>\n",
       "      <td>1993/美国/剧情 历史 战争</td>\n",
       "      <td>拯救一个人，就是拯救整个世界。</td>\n",
       "      <td>9.6</td>\n",
       "      <td>1006347</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>3541415</td>\n",
       "      <td>9</td>\n",
       "      <td>盗梦空间</td>\n",
       "      <td>https://movie.douban.com/subject/3541415/</td>\n",
       "      <td>导演: 克里斯托弗·诺兰 Christopher Nolan主演: 莱昂纳多·迪卡普里奥 L...</td>\n",
       "      <td>2010/美国 英国/剧情 科幻 悬疑 冒险</td>\n",
       "      <td>诺兰给了我们一场无法盗取的梦。</td>\n",
       "      <td>9.4</td>\n",
       "      <td>1882081</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>3011091</td>\n",
       "      <td>10</td>\n",
       "      <td>忠犬八公的故事</td>\n",
       "      <td>https://movie.douban.com/subject/3011091/</td>\n",
       "      <td>导演: 莱塞·霍尔斯道姆 Lasse Hallström主演: 理查·基尔 Richard ...</td>\n",
       "      <td>2009/美国 英国/剧情</td>\n",
       "      <td>永远都不能忘记你所爱的人。</td>\n",
       "      <td>9.4</td>\n",
       "      <td>1287928</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     mv_id  mv_rank  mv_name                                    mv_href  \\\n",
       "0  1292052        1   肖申克的救赎  https://movie.douban.com/subject/1292052/   \n",
       "1  1291546        2     霸王别姬  https://movie.douban.com/subject/1291546/   \n",
       "2  1292720        3     阿甘正传  https://movie.douban.com/subject/1292720/   \n",
       "3  1292722        4    泰坦尼克号  https://movie.douban.com/subject/1292722/   \n",
       "4  1295644        5  这个杀手不太冷  https://movie.douban.com/subject/1295644/   \n",
       "5  1292063        6     美丽人生  https://movie.douban.com/subject/1292063/   \n",
       "6  1291561        7     千与千寻  https://movie.douban.com/subject/1291561/   \n",
       "7  1295124        8   辛德勒的名单  https://movie.douban.com/subject/1295124/   \n",
       "8  3541415        9     盗梦空间  https://movie.douban.com/subject/3541415/   \n",
       "9  3011091       10  忠犬八公的故事  https://movie.douban.com/subject/3011091/   \n",
       "\n",
       "                                           mv_actors  \\\n",
       "0  导演: 弗兰克·德拉邦特 Frank Darabont主演: 蒂姆·罗宾斯 Tim Robb...   \n",
       "1  导演: 陈凯歌 Kaige Chen主演: 张国荣 Leslie Cheung / 张丰毅 ...   \n",
       "2  导演: 罗伯特·泽米吉斯 Robert Zemeckis主演: 汤姆·汉克斯 Tom Han...   \n",
       "3  导演: 詹姆斯·卡梅隆 James Cameron主演: 莱昂纳多·迪卡普里奥 Leonar...   \n",
       "4  导演: 吕克·贝松 Luc Besson主演: 让·雷诺 Jean Reno / 娜塔莉·波...   \n",
       "5  导演: 罗伯托·贝尼尼 Roberto Benigni主演: 罗伯托·贝尼尼 Roberto...   \n",
       "6  导演: 宫崎骏 Hayao Miyazaki主演: 柊瑠美 Rumi Hîragi / 入野...   \n",
       "7  导演: 史蒂文·斯皮尔伯格 Steven Spielberg主演: 连姆·尼森 Liam N...   \n",
       "8  导演: 克里斯托弗·诺兰 Christopher Nolan主演: 莱昂纳多·迪卡普里奥 L...   \n",
       "9  导演: 莱塞·霍尔斯道姆 Lasse Hallström主演: 理查·基尔 Richard ...   \n",
       "\n",
       "                         mv_type        mv_review mv_star mv_evaNum  \n",
       "0                  1994/美国/犯罪 剧情          希望让人自由。     9.7   2611849  \n",
       "1        1993/中国大陆 中国香港/剧情 爱情 同性            风华绝代。     9.6   1939306  \n",
       "2                  1994/美国/剧情 爱情        一部美国近现代史。     9.5   1963347  \n",
       "3  1997/美国 墨西哥 澳大利亚 加拿大/剧情 爱情 灾难       失去的才是永恒的。      9.4   1923794  \n",
       "4            1994/法国 美国/剧情 动作 犯罪  怪蜀黍和小萝莉不得不说的故事。     9.4   2115656  \n",
       "5           1997/意大利/剧情 喜剧 爱情 战争           最美的谎言。     9.6   1206493  \n",
       "6               2001/日本/剧情 动画 奇幻  最好的宫崎骏，最好的久石让。      9.4   2040396  \n",
       "7               1993/美国/剧情 历史 战争  拯救一个人，就是拯救整个世界。     9.6   1006347  \n",
       "8         2010/美国 英国/剧情 科幻 悬疑 冒险  诺兰给了我们一场无法盗取的梦。     9.4   1882081  \n",
       "9                  2009/美国 英国/剧情    永远都不能忘记你所爱的人。     9.4   1287928  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Inspect the scraped results\n",
    "mv_data"
   ]
  },
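  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `mv_type` column packs release year, countries, and genres into one string such as `1994/美国/犯罪 剧情`. Below is a minimal sketch of splitting it into separate fields before the rows are stored. The helper name `split_mv_type` and the output keys are assumptions for illustration, and the sketch assumes every value has exactly three `/`-separated parts, as the ten rows above do.\n",
    "\n",
    "```python\n",
    "def split_mv_type(mv_type):\n",
    "    # Expected shape: 'year/countries/genres'; multiple values are space-separated.\n",
    "    year, countries, genres = (part.strip() for part in mv_type.split('/'))\n",
    "    return {\n",
    "        'year': int(year),\n",
    "        'countries': countries.split(),\n",
    "        'genres': genres.split(),\n",
    "    }\n",
    "\n",
    "\n",
    "print(split_mv_type('1997/美国 墨西哥 澳大利亚 加拿大/剧情 爱情 灾难'))\n",
    "```\n",
    "\n",
    "Keeping countries and genres as lists would let a later query match a single genre without string parsing."
   ]
  },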
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.2 Scrape the reviews for each movie"
   ]
  },
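  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The crawl log for this step contains URLs such as `reviews?start=?rating=1?start=0`, which suggests the query string is assembled by plain string concatenation. Below is a minimal sketch of building it with `urllib.parse.urlencode` instead, assuming the reviews page takes `start` and `rating` as ordinary query parameters (both names are taken from the log; the helper `build_review_url` is hypothetical).\n",
    "\n",
    "```python\n",
    "from urllib.parse import urlencode\n",
    "\n",
    "\n",
    "def build_review_url(mv_id, start=0, rating=None):\n",
    "    # Collect the query parameters first; rating is optional.\n",
    "    params = {'start': start}\n",
    "    if rating is not None:\n",
    "        params['rating'] = rating\n",
    "    return f'https://movie.douban.com/subject/{mv_id}/reviews?{urlencode(params)}'\n",
    "\n",
    "\n",
    "print(build_review_url('1292052', start=20, rating=3))\n",
    "```\n",
    "\n",
    "The call above prints `https://movie.douban.com/subject/1292052/reviews?start=20&rating=3`, with a single `?` and the parameters joined by `&`."
   ]
  },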
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "==================================================[Crawler started]==================================================\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 1292052, name=肖申克的救赎, url=https://movie.douban.com/subject/1292052/reviews?start= ... \n",
      "\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=1?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=2?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=3?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=4?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=5?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=1?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=2?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=3?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=4?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=5?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=1?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=2?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=3?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=4?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292052/reviews?start=?rating=5?start=40\n",
      "[INFO] >> Crawled reviews saved to: ./data/reviews-肖申克的救赎.xlsx\n",
      "[INFO] >> Sleeping 1 s to avoid triggering anti-crawling measures\n",
      "-----------------------------------END:40 crawl finished, elapsed: 21.402 s-----------------------------------\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 1291546, name=霸王别姬, url=https://movie.douban.com/subject/1291546/reviews?start= ... \n",
      "\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=1?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=2?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=3?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=4?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=5?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=1?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=2?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=3?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=4?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=5?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=1?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=2?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=3?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=4?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291546/reviews?start=?rating=5?start=40\n",
      "[INFO] >> Crawled reviews saved to: ./data/reviews-霸王别姬.xlsx\n",
      "[INFO] >> Sleeping 1 s to avoid triggering anti-crawling measures\n",
      "-----------------------------------END:40 crawl finished, elapsed: 22.379 s-----------------------------------\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 1292720, name=阿甘正传, url=https://movie.douban.com/subject/1292720/reviews?start= ... \n",
      "\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=1?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=2?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=3?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=4?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=5?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=1?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=2?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=3?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=4?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=5?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=1?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=2?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=3?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=4?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292720/reviews?start=?rating=5?start=40\n",
      "[INFO] >> Crawled reviews saved to: ./data/reviews-阿甘正传.xlsx\n",
      "[INFO] >> Sleeping 1 s to avoid triggering anti-crawling measures\n",
      "-----------------------------------END:40 crawl finished, elapsed: 19.896 s-----------------------------------\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 1292722, name=泰坦尼克号, url=https://movie.douban.com/subject/1292722/reviews?start= ... \n",
      "\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=1?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=2?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=3?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=4?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=5?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=1?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=2?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=3?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=4?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=5?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=1?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=2?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=3?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=4?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292722/reviews?start=?rating=5?start=40\n",
      "[INFO] >> Crawled reviews saved to: ./data/reviews-泰坦尼克号.xlsx\n",
      "[INFO] >> Sleeping 1 s to avoid triggering anti-crawling measures\n",
      "-----------------------------------END:40 crawl finished, elapsed: 19.470 s-----------------------------------\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 1295644, name=这个杀手不太冷, url=https://movie.douban.com/subject/1295644/reviews?start= ... \n",
      "\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=1?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=2?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=3?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=4?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=5?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=1?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=2?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=3?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=4?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=5?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=1?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=2?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=3?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=4?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295644/reviews?start=?rating=5?start=40\n",
      "[INFO] >> Crawled reviews saved to: ./data/reviews-这个杀手不太冷.xlsx\n",
      "[INFO] >> Sleeping 1 s to avoid triggering anti-crawling measures\n",
      "-----------------------------------END:40 crawl finished, elapsed: 20.906 s-----------------------------------\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 1292063, name=美丽人生, url=https://movie.douban.com/subject/1292063/reviews?start= ... \n",
      "\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=1?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=2?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=3?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=4?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=5?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=1?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=2?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=3?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=4?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=5?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=1?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=2?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=3?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=4?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1292063/reviews?start=?rating=5?start=40\n",
      "[INFO] >> Crawled reviews saved to: ./data/reviews-美丽人生.xlsx\n",
      "[INFO] >> Sleeping 1 s to avoid triggering anti-crawling measures\n",
      "-----------------------------------END:40 crawl finished, elapsed: 28.229 s-----------------------------------\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 1291561, name=千与千寻, url=https://movie.douban.com/subject/1291561/reviews?start= ... \n",
      "\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=1?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=2?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=3?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=4?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=5?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=1?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=2?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=3?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=4?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=5?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=1?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=2?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=3?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=4?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1291561/reviews?start=?rating=5?start=40\n",
      "[INFO] >> Crawled reviews saved to: ./data/reviews-千与千寻.xlsx\n",
      "[INFO] >> Sleeping 1 s to avoid triggering anti-crawling measures\n",
      "-----------------------------------END:40 crawl finished, elapsed: 21.946 s-----------------------------------\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 1295124, name=辛德勒的名单, url=https://movie.douban.com/subject/1295124/reviews?start= ... \n",
      "\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=1?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=2?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=3?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=4?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=5?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=1?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=2?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=3?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=4?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=5?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=1?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=2?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=3?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=4?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/1295124/reviews?start=?rating=5?start=40\n",
      "[INFO] >> Crawled reviews saved to: ./data/reviews-辛德勒的名单.xlsx\n",
      "[INFO] >> Sleeping 1 s to avoid triggering anti-crawling measures\n",
      "-----------------------------------END:40 crawl finished, elapsed: 21.505 s-----------------------------------\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 3541415, name=盗梦空间, url=https://movie.douban.com/subject/3541415/reviews?start= ... \n",
      "\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=1?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=2?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=3?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=4?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=5?start=0\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=1?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=2?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=3?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=4?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=5?start=20\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=1?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=2?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=3?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=4?start=40\n",
      "[INFO] >> Start crawling reviews from https://movie.douban.com/subject/3541415/reviews?start=?rating=5?start=40\n",
      "[INFO] >> Crawled reviews saved to: ./data/reviews-盗梦空间.xlsx\n",
      "[INFO] >> Sleeping 1 s to avoid triggering anti-crawling measures\n",
      "-----------------------------------END:40 crawl finished, elapsed: 23.484 s-----------------------------------\n",
      "--------------------------------------------------BEGIN:40--------------------------------------------------\n",
      "[INFO] >> Crawling reviews for id = 3011091, name=忠犬八公的故事, url=https://movie.douban.com/subject/3011091/reviews?start= ... \n",
      "\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=0 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=1?start=0 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=2?start=0 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=3?start=0 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=4?start=0 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=5?start=0 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=20 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=1?start=20 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=2?start=20 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=3?start=20 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=4?start=20 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=5?start=20 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=40 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=1?start=40 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=2?start=40 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=3?start=40 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=4?start=40 的评论\n",
      "[INFO] >> 开始爬取 https://movie.douban.com/subject/3011091/reviews?start=?rating=5?start=40 的评论\n",
      "[INFO] >> 爬取的评论结果保存到了当前文件夹的: ./data/reviews-忠犬八公的故事.xlsx\n",
      "[INFO] >> 为防止反爬虫机制启动，睡眠 1 s\n",
      "-----------------------------------END:40 爬取完毕, 本次爬取耗时:19.586 s-----------------------------------\n",
      "[INFO] >> 所有影评的评分情况已保存到本地的文件: ./mv_stars.xlsx中\n",
      "=============================================程序执行完毕，总耗时: 218.887 s=============================================\n"
     ]
    }
   ],
   "source": [
    "# 1. Imports\n",
    "import requests\n",
    "import re\n",
    "import pandas as pd\n",
    "from bs4 import BeautifulSoup\n",
    "import time\n",
    "\n",
    "# Rating distribution of every movie, one dict per movie\n",
    "mv_stars = []\n",
    "# 2. Request settings: a browser-like User-Agent header; the URLs are built per movie below\n",
    "headers = {\n",
    "    'User-Agent':  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',\n",
    "}\n",
    "\n",
    "# Return the text found between '>' and '<' in s, or None if nothing matches\n",
    "def getText(s):\n",
    "    # Convert non-str input (e.g. a bs4 result list) to str first\n",
    "    if not isinstance(s, str):\n",
    "        s = str(s).strip().replace('\\n', '')\n",
    "    text = re.search('>([^\\\\)]*)<', s)\n",
    "    return text.group(1) if text is not None else None\n",
    "\n",
    "# Scrape the reviews listed on one page\n",
    "def getInfo(url: str, mv_id: str) -> pd.DataFrame:\n",
    "    print('[INFO] >> 开始爬取 ' + url + ' 的评论')\n",
    "    result = []\n",
    "    # Fetch the page\n",
    "    html = requests.get(url, headers=headers).text\n",
    "    # Parse the HTML with the parser bundled with Python\n",
    "    soup = BeautifulSoup(html, 'html.parser')\n",
    "    for x in soup.select('.review-item'):\n",
    "        # Reviewer name\n",
    "        rv_name = getText(x.select('.name'))\n",
    "        # Review time\n",
    "        rv_time = getText(x.select('.main-meta'))\n",
    "        # Review text (short version, without the spoiler notice)\n",
    "        rv_info = getText(x.select('.review-short > .short-content')).split('(')[0].replace('<p class=\"spoiler-tip\">这篇影评可能有剧透</p>','').strip()\n",
    "        # Agree and disagree vote counts of the review\n",
    "        rv_action_agree = getText(x.select('.main-bd > .action > .up > span')).strip()\n",
    "        rv_action_disagree = getText(x.select('.main-bd > .action > .down > span')).strip()\n",
    "        result.append({\n",
    "            'rv_name': rv_name,\n",
    "            'rv_time': rv_time,\n",
    "            'rv_info': rv_info,\n",
    "            'rv_mv_id': mv_id,\n",
    "            'rv_action_agree': rv_action_agree,\n",
    "            'rv_action_disagree' : rv_action_disagree\n",
    "        })\n",
    "    # Preprocessing: replace empty vote counts with 0\n",
    "    result = pd.DataFrame(data = result)\n",
    "    result['rv_action_agree'] = result['rv_action_agree'].apply(lambda x : 0 if x == '' else x)\n",
    "    result['rv_action_disagree'] = result['rv_action_disagree'].apply(lambda x : 0 if x == '' else x)\n",
    "    return result\n",
    "\n",
    "# Scrape the number of reviews per star rating (1 to 5) of a movie\n",
    "def getStar(url) -> dict:\n",
    "    result = {}\n",
    "    html = requests.get(url, headers=headers).text\n",
    "    soup = BeautifulSoup(html, 'html.parser')\n",
    "    for star in range(1, 6):\n",
    "        node = soup.select('.droplist > .rating' + str(star))\n",
    "        # The count is the number shown in parentheses\n",
    "        count = re.search(r'\\(([0-9]+)\\)', str(node)).group(1)\n",
    "        result[star] = int(count)\n",
    "    return result\n",
    "\n",
    "# Scrape the reviews and the rating distribution of one movie by its Douban ID\n",
    "i = 0  # number of movies scraped so far\n",
    "def crawlerById(id, name) -> dict:\n",
    "    global mv_stars, i\n",
    "    # Start the timer\n",
    "    startTime = time.time()\n",
    "    base_url = 'https://movie.douban.com/subject/' + str(id) + '/reviews'\n",
    "    # Default ordering (by popularity)\n",
    "    url_by_hot = base_url + '?start='\n",
    "    # Filtered by star rating, 1 to 5; query parameters are joined with '&'\n",
    "    url_by_star = [base_url + '?rating=' + str(star) + '&start=' for star in range(1, 6)]\n",
    "    # Build the URL list for the offsets 0, 20 and 40, i.e. three pages per ordering\n",
    "    urls = []\n",
    "    for start in range(0, 41, 20):\n",
    "        urls.append(url_by_hot + str(start))\n",
    "        for url in url_by_star:\n",
    "            urls.append(url + str(start))\n",
    "    print('-' * 50 + 'BEGIN:' + str(i) + '-' * 50)\n",
    "    print(f'[INFO] >> 正在爬取 id = {id}, 名称={name}, url={url_by_hot} 的影评... \\n')\n",
    "    mv_reviews = {}\n",
    "    # Scrape the reviews\n",
    "    mv_reviews['reviews'] = pd.concat([getInfo(url, id) for url in urls])\n",
    "    # Scrape the overall rating distribution of the movie\n",
    "    mv_reviews['star'] = {}\n",
    "    mv_reviews['star']['id'] = id\n",
    "    mv_reviews['star']['name'] = name\n",
    "    mv_reviews['star'].update(getStar(urls[0]))\n",
    "    # Save the reviews locally\n",
    "    save_path_reviews = './data/reviews-' + str(name) + '.xlsx'\n",
    "    mv_reviews['reviews'].to_excel(save_path_reviews, index=False)\n",
    "    print(f'[INFO] >> 爬取的评论结果保存到了当前文件夹的: {save_path_reviews}')\n",
    "    # Record the rating distribution in mv_stars\n",
    "    mv_stars.append(mv_reviews['star'])\n",
    "    print('[INFO] >> 为防止反爬虫机制启动，睡眠 1 s')\n",
    "    time.sleep(1)\n",
    "    endTime = time.time()\n",
    "    print('-' * 35 + 'END:' + str(i) + ' 爬取完毕, 本次爬取耗时:' + f'{endTime - startTime:.3f}' + ' s' + '-' * 35)\n",
    "    i += 1\n",
    "    return mv_reviews\n",
    "\n",
    "print('=' * 50 + '【启动爬虫程序】' + '=' * 50)\n",
    "startTime = time.time()\n",
    "\n",
    "list_mv = []\n",
    "# Scrape the reviews of the movies collected in section 1.1\n",
    "for (idx, mv) in mv_data.iterrows():\n",
    "    list_mv.append(crawlerById(mv.mv_id, mv.mv_name))\n",
    "# Export the rating distribution of all movies to Excel\n",
    "mv_stars = pd.DataFrame(data = mv_stars)\n",
    "mv_stars.to_excel('./data/mv_stars.xlsx')\n",
    "print('[INFO] >> 所有影评的评分情况已保存到本地的文件: ./data/mv_stars.xlsx中')\n",
    "\n",
    "endTime = time.time()\n",
    "print('=' * 45 + '程序执行完毕，总耗时: ' + f'{endTime - startTime:.3f}' + ' s' + '=' * 45)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.3 Organize the scraped data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>reviews</th>\n",
       "      <th>star</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0      大头绿...</td>\n",
       "      <td>{'id': '1292052', 'name': '肖申克的救赎', 1: 32, 2: ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0   ...</td>\n",
       "      <td>{'id': '1291546', 'name': '霸王别姬', 1: 23, 2: 19...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0     kino ...</td>\n",
       "      <td>{'id': '1292720', 'name': '阿甘正传', 1: 28, 2: 18...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0   waff...</td>\n",
       "      <td>{'id': '1292722', 'name': '泰坦尼克号', 1: 16, 2: 1...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0      ...</td>\n",
       "      <td>{'id': '1295644', 'name': '这个杀手不太冷', 1: 14, 2:...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0    小隐隐于浆...</td>\n",
       "      <td>{'id': '1292063', 'name': '美丽人生', 1: 12, 2: 15...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0      le...</td>\n",
       "      <td>{'id': '1291561', 'name': '千与千寻', 1: 16, 2: 11...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0    cxybo...</td>\n",
       "      <td>{'id': '1295124', 'name': '辛德勒的名单', 1: 5, 2: 3...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0 ...</td>\n",
       "      <td>{'id': '3541415', 'name': '盗梦空间', 1: 37, 2: 33...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>rv_name              rv_time  \\\n",
       "0       暖...</td>\n",
       "      <td>{'id': '3011091', 'name': '忠犬八公的故事', 1: 12, 2:...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                             reviews  \\\n",
       "0      rv_name              rv_time  \\\n",
       "0      大头绿...   \n",
       "1            rv_name              rv_time  \\\n",
       "0   ...   \n",
       "2     rv_name              rv_time  \\\n",
       "0     kino ...   \n",
       "3        rv_name              rv_time  \\\n",
       "0   waff...   \n",
       "4         rv_name              rv_time  \\\n",
       "0      ...   \n",
       "5      rv_name              rv_time  \\\n",
       "0    小隐隐于浆...   \n",
       "6       rv_name              rv_time  \\\n",
       "0      le...   \n",
       "7      rv_name              rv_time  \\\n",
       "0    cxybo...   \n",
       "8              rv_name              rv_time  \\\n",
       "0 ...   \n",
       "9       rv_name              rv_time  \\\n",
       "0       暖...   \n",
       "\n",
       "                                                star  \n",
       "0  {'id': '1292052', 'name': '肖申克的救赎', 1: 32, 2: ...  \n",
       "1  {'id': '1291546', 'name': '霸王别姬', 1: 23, 2: 19...  \n",
       "2  {'id': '1292720', 'name': '阿甘正传', 1: 28, 2: 18...  \n",
       "3  {'id': '1292722', 'name': '泰坦尼克号', 1: 16, 2: 1...  \n",
       "4  {'id': '1295644', 'name': '这个杀手不太冷', 1: 14, 2:...  \n",
       "5  {'id': '1292063', 'name': '美丽人生', 1: 12, 2: 15...  \n",
       "6  {'id': '1291561', 'name': '千与千寻', 1: 16, 2: 11...  \n",
       "7  {'id': '1295124', 'name': '辛德勒的名单', 1: 5, 2: 3...  \n",
       "8  {'id': '3541415', 'name': '盗梦空间', 1: 37, 2: 33...  \n",
       "9  {'id': '3011091', 'name': '忠犬八公的故事', 1: 12, 2:...  "
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_mv = pd.DataFrame(data = list_mv)\n",
    "df_mv\n",
    "# list_mv[0]['reviews'].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'id': '1292052', 'name': '肖申克的救赎', 1: 32, 2: 24, 3: 195, 4: 1693, 5: 8836}"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list_mv[0]['star']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2. Working with the Douban review dataset in MongoDB"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.1 Create a MongoDB connection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'mv'), 'dc_mv_review')"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from pymongo import MongoClient\n",
    "\n",
    "# Connect to the local MongoDB server\n",
    "client = MongoClient('localhost', 27017)\n",
    "\n",
    "db = client.mv\n",
    "# Handle to the movie-info collection (MongoDB creates it on the first insert)\n",
    "ct_mv_info = db.dc_mv_info\n",
    "# Handle to the review collection\n",
    "ct_mv_review = db.dc_mv_review\n",
    "\n",
    "# Show the collection object\n",
    "ct_mv_review\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.2 Insert documents into a MongoDB collection\n",
    "\n",
    "First convert the movie information from the DataFrame into a list of dicts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'_id': '1292052',\n",
       "  'mv_id': '1292052',\n",
       "  'mv_rank': 1,\n",
       "  'mv_name': '肖申克的救赎',\n",
       "  'mv_href': 'https://movie.douban.com/subject/1292052/',\n",
       "  'mv_actors': '导演: 弗兰克·德拉邦特 Frank Darabont主演: 蒂姆·罗宾斯 Tim Robbins /...',\n",
       "  'mv_type': '1994/美国/犯罪 剧情',\n",
       "  'mv_review': '希望让人自由。',\n",
       "  'mv_star': '9.7',\n",
       "  'mv_evaNum': '2611849'},\n",
       " {'_id': '1291546',\n",
       "  'mv_id': '1291546',\n",
       "  'mv_rank': 2,\n",
       "  'mv_name': '霸王别姬',\n",
       "  'mv_href': 'https://movie.douban.com/subject/1291546/',\n",
       "  'mv_actors': '导演: 陈凯歌 Kaige Chen主演: 张国荣 Leslie Cheung / 张丰毅 Fengyi Zha...',\n",
       "  'mv_type': '1993/中国大陆 中国香港/剧情 爱情 同性',\n",
       "  'mv_review': '风华绝代。',\n",
       "  'mv_star': '9.6',\n",
       "  'mv_evaNum': '1939306'},\n",
       " {'_id': '1292720',\n",
       "  'mv_id': '1292720',\n",
       "  'mv_rank': 3,\n",
       "  'mv_name': '阿甘正传',\n",
       "  'mv_href': 'https://movie.douban.com/subject/1292720/',\n",
       "  'mv_actors': '导演: 罗伯特·泽米吉斯 Robert Zemeckis主演: 汤姆·汉克斯 Tom Hanks / ...',\n",
       "  'mv_type': '1994/美国/剧情 爱情',\n",
       "  'mv_review': '一部美国近现代史。',\n",
       "  'mv_star': '9.5',\n",
       "  'mv_evaNum': '1963347'},\n",
       " {'_id': '1292722',\n",
       "  'mv_id': '1292722',\n",
       "  'mv_rank': 4,\n",
       "  'mv_name': '泰坦尼克号',\n",
       "  'mv_href': 'https://movie.douban.com/subject/1292722/',\n",
       "  'mv_actors': '导演: 詹姆斯·卡梅隆 James Cameron主演: 莱昂纳多·迪卡普里奥 Leonardo...',\n",
       "  'mv_type': '1997/美国 墨西哥 澳大利亚 加拿大/剧情 爱情 灾难',\n",
       "  'mv_review': '失去的才是永恒的。 ',\n",
       "  'mv_star': '9.4',\n",
       "  'mv_evaNum': '1923794'},\n",
       " {'_id': '1295644',\n",
       "  'mv_id': '1295644',\n",
       "  'mv_rank': 5,\n",
       "  'mv_name': '这个杀手不太冷',\n",
       "  'mv_href': 'https://movie.douban.com/subject/1295644/',\n",
       "  'mv_actors': '导演: 吕克·贝松 Luc Besson主演: 让·雷诺 Jean Reno / 娜塔莉·波特曼 ...',\n",
       "  'mv_type': '1994/法国 美国/剧情 动作 犯罪',\n",
       "  'mv_review': '怪蜀黍和小萝莉不得不说的故事。',\n",
       "  'mv_star': '9.4',\n",
       "  'mv_evaNum': '2115656'},\n",
       " {'_id': '1292063',\n",
       "  'mv_id': '1292063',\n",
       "  'mv_rank': 6,\n",
       "  'mv_name': '美丽人生',\n",
       "  'mv_href': 'https://movie.douban.com/subject/1292063/',\n",
       "  'mv_actors': '导演: 罗伯托·贝尼尼 Roberto Benigni主演: 罗伯托·贝尼尼 Roberto Beni...',\n",
       "  'mv_type': '1997/意大利/剧情 喜剧 爱情 战争',\n",
       "  'mv_review': '最美的谎言。',\n",
       "  'mv_star': '9.6',\n",
       "  'mv_evaNum': '1206493'},\n",
       " {'_id': '1291561',\n",
       "  'mv_id': '1291561',\n",
       "  'mv_rank': 7,\n",
       "  'mv_name': '千与千寻',\n",
       "  'mv_href': 'https://movie.douban.com/subject/1291561/',\n",
       "  'mv_actors': '导演: 宫崎骏 Hayao Miyazaki主演: 柊瑠美 Rumi Hîragi / 入野自由 Miy...',\n",
       "  'mv_type': '2001/日本/剧情 动画 奇幻',\n",
       "  'mv_review': '最好的宫崎骏，最好的久石让。 ',\n",
       "  'mv_star': '9.4',\n",
       "  'mv_evaNum': '2040396'},\n",
       " {'_id': '1295124',\n",
       "  'mv_id': '1295124',\n",
       "  'mv_rank': 8,\n",
       "  'mv_name': '辛德勒的名单',\n",
       "  'mv_href': 'https://movie.douban.com/subject/1295124/',\n",
       "  'mv_actors': '导演: 史蒂文·斯皮尔伯格 Steven Spielberg主演: 连姆·尼森 Liam Neeson...',\n",
       "  'mv_type': '1993/美国/剧情 历史 战争',\n",
       "  'mv_review': '拯救一个人，就是拯救整个世界。',\n",
       "  'mv_star': '9.6',\n",
       "  'mv_evaNum': '1006347'},\n",
       " {'_id': '3541415',\n",
       "  'mv_id': '3541415',\n",
       "  'mv_rank': 9,\n",
       "  'mv_name': '盗梦空间',\n",
       "  'mv_href': 'https://movie.douban.com/subject/3541415/',\n",
       "  'mv_actors': '导演: 克里斯托弗·诺兰 Christopher Nolan主演: 莱昂纳多·迪卡普里奥 Le...',\n",
       "  'mv_type': '2010/美国 英国/剧情 科幻 悬疑 冒险',\n",
       "  'mv_review': '诺兰给了我们一场无法盗取的梦。',\n",
       "  'mv_star': '9.4',\n",
       "  'mv_evaNum': '1882081'},\n",
       " {'_id': '3011091',\n",
       "  'mv_id': '3011091',\n",
       "  'mv_rank': 10,\n",
       "  'mv_name': '忠犬八公的故事',\n",
       "  'mv_href': 'https://movie.douban.com/subject/3011091/',\n",
       "  'mv_actors': '导演: 莱塞·霍尔斯道姆 Lasse Hallström主演: 理查·基尔 Richard Ger...',\n",
       "  'mv_type': '2009/美国 英国/剧情',\n",
       "  'mv_review': '永远都不能忘记你所爱的人。',\n",
       "  'mv_star': '9.4',\n",
       "  'mv_evaNum': '1287928'}]"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dc_mv = []\n",
    "for row in mv_data.values.tolist():\n",
    "    # Pair every column name with the matching value of this row\n",
    "    dict_info = dict(zip(mv_data.columns, row))\n",
    "    # Use the Douban movie ID as the document _id and keep it as the first key\n",
    "    dc_mv.append({'_id': dict_info['mv_id'], **dict_info})\n",
    "dc_mv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<pymongo.results.InsertManyResult at 0x2114865bc80>"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Empty the collection first so that re-running the cell does not raise duplicate _id errors\n",
    "ct_mv_info.delete_many({})\n",
    "# Insert the documents\n",
    "ct_mv_info.insert_many(dc_mv)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.3 Inspect the data inserted into MongoDB"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'_id': '1292052',\n",
       " 'mv_id': '1292052',\n",
       " 'mv_rank': 1,\n",
       " 'mv_name': '肖申克的救赎',\n",
       " 'mv_href': 'https://movie.douban.com/subject/1292052/',\n",
       " 'mv_actors': '导演: 弗兰克·德拉邦特 Frank Darabont主演: 蒂姆·罗宾斯 Tim Robbins /...',\n",
       " 'mv_type': '1994/美国/犯罪 剧情',\n",
       " 'mv_review': '希望让人自由。',\n",
       " 'mv_star': '9.7',\n",
       " 'mv_evaNum': '2611849'}"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ct_mv_info.find_one()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.4 Prepare the reviews in the same way\n",
    "\n",
    "First convert the review DataFrames into dicts that can be inserted into MongoDB."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] >> 共获取到 10 个电影 3432 个的影评\n",
      "[INFO] >> 查看其中的一个影评： [{'_id': '12920520', 'rv_name': '大头绿豆', 'rv_time': '2005-05-12 20:44:13', 'rv_info': '距离斯蒂芬·金（Stephen King）和德拉邦特（Frank Darabont）们缔造这部伟大的作品已经有十年了。我知道美好的东西想必大家都能感受，但是很抱歉，我的聒噪仍将一如既往。 在我眼里，肖申克的救赎与信念、自由和友谊有关。 ［1］信 念 瑞德（Red）说，希望是危险的东西，是精...', 'rv_mv_id': '1292052', 'rv_action_agree': '17526', 'rv_action_disagree': '708'}]\n"
     ]
    }
   ],
   "source": [
    "# Each entry of list_mv is a dict with the keys 'reviews' and 'star'\n",
    "\n",
    "# Collect the scraped reviews of every movie as lists of dicts ready for MongoDB\n",
    "def getAllReviews() -> list[list]:\n",
    "    reviews = []\n",
    "    for index in range(len(list_mv)):\n",
    "        df_reviews = list_mv[index]['reviews']\n",
    "        # All review documents of the current movie\n",
    "        dc_reviews = []\n",
    "        # seq numbers the reviews of this movie, starting from 0\n",
    "        for seq, (_, row) in enumerate(df_reviews.iterrows()):\n",
    "            # _id = movie ID followed by the sequence number of the review\n",
    "            dict_info = {'_id' : mv_data['mv_id'][index] + str(seq)}\n",
    "            for col in df_reviews.columns:\n",
    "                dict_info[col] = row[col]\n",
    "            dc_reviews.append(dict_info)\n",
    "        reviews.append(dc_reviews)\n",
    "    return reviews\n",
    "\n",
    "# Reviews scraped for the Top 10 movies, one list per movie\n",
    "dc_reviews = getAllReviews()\n",
    "count = sum(len(rv) for rv in dc_reviews)\n",
    "print(f'[INFO] >> 共获取到 {len(dc_reviews)} 个电影 {count} 个的影评')\n",
    "\n",
    "print(f'[INFO] >> 查看其中的一个影评： {dc_reviews[0][:1]}')\n"
   ]
  },
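  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sample document printed above stores `rv_action_agree` and `rv_action_disagree` as strings. A minimal sketch of casting them to integers while building each document is shown below. `to_review_doc` is a hypothetical helper that is not part of this notebook, and it assumes that a count is either a plain digit string or empty.\n",
    "\n",
    "```python\n",
    "def to_review_doc(movie_id, seq, row):\n",
    "    # Copy the scraped fields and build the _id from movie ID + sequence number\n",
    "    doc = {'_id': f'{movie_id}{seq}', **row}\n",
    "    for field in ('rv_action_agree', 'rv_action_disagree'):\n",
    "        raw = str(doc[field]).strip()\n",
    "        # Assumption: a count is a digit string; anything else is treated as 0\n",
    "        doc[field] = int(raw) if raw.isdigit() else 0\n",
    "    return doc\n",
    "\n",
    "# Sample values copied from the review shown above\n",
    "sample = {'rv_name': '大头绿豆', 'rv_mv_id': '1292052',\n",
    "          'rv_action_agree': '17526', 'rv_action_disagree': '708'}\n",
    "doc = to_review_doc('1292052', 0, sample)\n",
    "print(doc['_id'], doc['rv_action_agree'], doc['rv_action_disagree'])\n",
    "```\n",
    "\n",
    "With numeric counts, the server-side code in section 3.2 would no longer have to convert strings before computing the vote ratio."
   ]
  },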
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.5 Insert the reviews into MongoDB"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "ct_mv_review.delete_many({})\n",
    "for rv in dc_reviews:\n",
    "    ct_mv_review.insert_many(rv)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'_id': '12920520',\n",
       " 'rv_name': '大头绿豆',\n",
       " 'rv_time': '2005-05-12 20:44:13',\n",
       " 'rv_info': '距离斯蒂芬·金（Stephen King）和德拉邦特（Frank Darabont）们缔造这部伟大的作品已经有十年了。我知道美好的东西想必大家都能感受，但是很抱歉，我的聒噪仍将一如既往。 在我眼里，肖申克的救赎与信念、自由和友谊有关。 ［1］信 念 瑞德（Red）说，希望是危险的东西，是精...',\n",
       " 'rv_mv_id': '1292052',\n",
       " 'rv_action_agree': '17526',\n",
       " 'rv_action_disagree': '708'}"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Check the inserted data\n",
    "ct_mv_review.find_one()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 3. MongoDB in practice\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.1 Average rating of the Douban Top 10 movies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'_id': '1292052',\n",
       " 'mv_id': '1292052',\n",
       " 'mv_rank': 1,\n",
       " 'mv_name': '肖申克的救赎',\n",
       " 'mv_href': 'https://movie.douban.com/subject/1292052/',\n",
       " 'mv_actors': '导演: 弗兰克·德拉邦特 Frank Darabont主演: 蒂姆·罗宾斯 Tim Robbins /...',\n",
       " 'mv_type': '1994/美国/犯罪 剧情',\n",
       " 'mv_review': '希望让人自由。',\n",
       " 'mv_star': '9.7',\n",
       " 'mv_evaNum': '2611849'}"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ct_mv_info.find_one()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'mv'), 'mv_star_avg')"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from bson.code import Code\n",
    "\n",
    "# Note: Collection.map_reduce was removed in PyMongo 4.0, so this cell needs PyMongo 3.x\n",
    "mapper = Code(\"\"\"function(){\n",
    "       // mv_star is stored as a string; one shared key groups all movies\n",
    "       emit('', {count:1, mv_star: parseFloat(this.mv_star)});\n",
    "    }\n",
    "\"\"\")\n",
    "reducer = Code(\"\"\"function(k, v) {\n",
    "        // Sum the counts and ratings emitted for the key\n",
    "        var reducedVal = {count: 0, mv_star: 0};\n",
    "        for (var idx = 0; idx < v.length; idx++) {\n",
    "            reducedVal.count += v[idx].count;\n",
    "            reducedVal.mv_star += v[idx].mv_star;\n",
    "        }\n",
    "        return reducedVal;\n",
    "    }\n",
    "\"\"\")\n",
    "\n",
    "finalizer = Code(\"\"\"function(k, reducedVal) {\n",
    "        // Turn the summed rating into an average\n",
    "        reducedVal.mv_star_avg = reducedVal.mv_star/reducedVal.count;\n",
    "        return reducedVal;\n",
    "    }\n",
    "\"\"\")\n",
    "res = ct_mv_info.map_reduce(map = mapper,reduce=reducer,out='mv_star_avg', finalize = finalizer)\n",
    "res\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'_id': '', 'value': {'count': 10.0, 'mv_star': 95.0, 'mv_star_avg': 9.5}}\n"
     ]
    }
   ],
   "source": [
    "ct_mv_star_avg = db.mv_star_avg\n",
    "for x in ct_mv_star_avg.find():\n",
    "    print(x)"
   ]
  },
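  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The result shows an average rating of 9.5 across the 10 movies, which is very high. The figure itself has little analytical value; the example mainly serves to practise the basic use of MapReduce in MongoDB."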
   ]
  },
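  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the three MapReduce stages used above concrete, here is a minimal plain-Python sketch of the same map, reduce and finalize flow. It is an illustration only and not how MongoDB executes the job: the names `map_doc`, `reduce_values`, `finalize` and `run_map_reduce` are hypothetical, and the sample documents carry only the `mv_star` strings listed in section 2.2.\n",
    "\n",
    "```python\n",
    "from collections import defaultdict\n",
    "\n",
    "# mv_star strings of the ten movies, copied from the documents in section 2.2\n",
    "docs = [{'mv_star': s} for s in\n",
    "        ['9.7', '9.6', '9.5', '9.4', '9.4', '9.6', '9.4', '9.6', '9.4', '9.4']]\n",
    "\n",
    "def map_doc(doc):\n",
    "    # Emit one (key, value) pair per document; a single key groups all movies\n",
    "    yield '', {'count': 1, 'mv_star': float(doc['mv_star'])}\n",
    "\n",
    "def reduce_values(key, values):\n",
    "    # Sum the partial counts and ratings that share one key\n",
    "    total = {'count': 0, 'mv_star': 0.0}\n",
    "    for v in values:\n",
    "        total['count'] += v['count']\n",
    "        total['mv_star'] += v['mv_star']\n",
    "    return total\n",
    "\n",
    "def finalize(key, reduced):\n",
    "    # Derive the average once all values of the key are merged\n",
    "    reduced['mv_star_avg'] = reduced['mv_star'] / reduced['count']\n",
    "    return reduced\n",
    "\n",
    "def run_map_reduce(documents):\n",
    "    # Group the emitted values by key, then reduce and finalize each group\n",
    "    groups = defaultdict(list)\n",
    "    for doc in documents:\n",
    "        for key, value in map_doc(doc):\n",
    "            groups[key].append(value)\n",
    "    return {key: finalize(key, reduce_values(key, values))\n",
    "            for key, values in groups.items()}\n",
    "\n",
    "result = run_map_reduce(docs)\n",
    "print(round(result['']['mv_star_avg'], 2))\n",
    "```\n",
    "\n",
    "Keeping `count` and the running sum in the reduced value, and dividing only in the finalize step, is what allows partial results to be merged in any order."
   ]
  },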
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.2 Average [agree / disagree] vote ratio of the reviews of the Top 10 movies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'_id': '12920520',\n",
       " 'rv_name': '大头绿豆',\n",
       " 'rv_time': '2005-05-12 20:44:13',\n",
       " 'rv_info': '距离斯蒂芬·金（Stephen King）和德拉邦特（Frank Darabont）们缔造这部伟大的作品已经有十年了。我知道美好的东西想必大家都能感受，但是很抱歉，我的聒噪仍将一如既往。 在我眼里，肖申克的救赎与信念、自由和友谊有关。 ［1］信 念 瑞德（Red）说，希望是危险的东西，是精...',\n",
       " 'rv_mv_id': '1292052',\n",
       " 'rv_action_agree': '17526',\n",
       " 'rv_action_disagree': '708'}"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ct_mv_review.find_one()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'mv'), 'mv_agree_divide_disagree_rate')"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
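   ],
   "source": [
    "from bson.code import Code\n",
    "\n",
    "# Note: Collection.map_reduce was removed in PyMongo 4.0, so this cell needs PyMongo 3.x\n",
    "mapper = Code(\"\"\"function(){\n",
    "       // The vote counts are stored as strings, so convert them to numbers\n",
    "       var a = Number(this.rv_action_agree);\n",
    "       var b = Number(this.rv_action_disagree);\n",
    "       var rate = 0;\n",
    "       // Only reviews with both counts positive are included\n",
    "       if(a > 0 && b > 0){\n",
    "           // Share of disagree votes among all votes of the review\n",
    "           rate = b / (a + b);\n",
    "           emit(this.rv_mv_id, {count: 1, rate: rate});\n",
    "       } \n",
    "    }\n",
    "\"\"\")\n",
    "reducer = Code(\"\"\"function(k, v) {\n",
    "        // Sum the counts and rates emitted for one movie\n",
    "        var reducedVal = {count: 0, rate: 0};\n",
    "        for (var i = 0; i < v.length; i++) {\n",
    "            reducedVal.count += v[i].count;\n",
    "            reducedVal.rate += v[i].rate;\n",
    "        }\n",
    "        return reducedVal;\n",
    "    }\n",
    "\"\"\")\n",
    "\n",
    "finalizer = Code(\"\"\"function(k, reducedVal) {\n",
    "        // Turn the summed rate into the per-movie average\n",
    "        reducedVal.rate = reducedVal.rate/reducedVal.count;\n",
    "        return reducedVal;\n",
    "    }\n",
    "\"\"\")\n",
    "res = ct_mv_review.map_reduce(map = mapper,reduce=reducer,out='mv_agree_divide_disagree_rate', finalize = finalizer)\n",
    "res\n"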
   ]
  },
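  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The quantity computed by the cell above can be written as a formula. The symbols are introduced here for illustration only: for movie $m$, let $N_m$ be the number of its reviews with both vote counts positive, and let $a_i$ and $d_i$ be the `rv_action_agree` and `rv_action_disagree` counts of review $i$.\n",
    "\n",
    "$$\\mathrm{rate}_m = \\frac{1}{N_m} \\sum_{i=1}^{N_m} \\frac{d_i}{a_i + d_i}$$\n",
    "\n",
    "Each term is the share of disagree votes $d_i / (a_i + d_i)$ of one review, not the quotient $a_i / d_i$. For the sample review shown earlier ($a = 17526$, $d = 708$) the term is $708 / 18234 \\approx 0.039$."
   ]
  },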
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'_id': '1291561', 'value': {'count': 311.0, 'rate': 0.06123543891335884}}\n",
      "{'_id': '1292720', 'value': {'count': 344.0, 'rate': 0.08751263808581845}}\n",
      "{'_id': '1292063', 'value': {'count': 320.0, 'rate': 0.06863933789127512}}\n",
      "{'_id': '1295644', 'value': {'count': 339.0, 'rate': 0.0766028318668929}}\n",
      "{'_id': '3011091', 'value': {'count': 256.0, 'rate': 0.06715387472281786}}\n",
      "{'_id': '3541415', 'value': {'count': 335.0, 'rate': 0.11069258491308316}}\n",
      "{'_id': '1292052', 'value': {'count': 293.0, 'rate': 0.08145591398735749}}\n",
      "{'_id': '1291546', 'value': {'count': 318.0, 'rate': 0.06597528006880007}}\n",
      "{'_id': '1292722', 'value': {'count': 306.0, 'rate': 0.0696867605874007}}\n",
      "{'_id': '1295124', 'value': {'count': 326.0, 'rate': 0.0720808131916883}}\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ct_mv_agree_divide_disagree_rate = db.mv_agree_divide_disagree_rate\n",
    "\n",
    "# 查询mongodb文档数据并持久化\n",
    "temp1 = []\n",
    "for x in ct_mv_agree_divide_disagree_rate.find():\n",
    "    print(x)\n",
    "    temp1.append(x)\n",
    "len(temp1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'_id': '1291561',\n",
       "  'value': {'count': 311.0, 'rate': 0.06123543891335884},\n",
       "  'mv_name': '千与千寻'},\n",
       " {'_id': '1292720',\n",
       "  'value': {'count': 344.0, 'rate': 0.08751263808581845},\n",
       "  'mv_name': '阿甘正传'},\n",
       " {'_id': '1292063',\n",
       "  'value': {'count': 320.0, 'rate': 0.06863933789127512},\n",
       "  'mv_name': '美丽人生'},\n",
       " {'_id': '1295644',\n",
       "  'value': {'count': 339.0, 'rate': 0.0766028318668929},\n",
       "  'mv_name': '这个杀手不太冷'},\n",
       " {'_id': '3011091',\n",
       "  'value': {'count': 256.0, 'rate': 0.06715387472281786},\n",
       "  'mv_name': '忠犬八公的故事'},\n",
       " {'_id': '3541415',\n",
       "  'value': {'count': 335.0, 'rate': 0.11069258491308316},\n",
       "  'mv_name': '盗梦空间'},\n",
       " {'_id': '1292052',\n",
       "  'value': {'count': 293.0, 'rate': 0.08145591398735749},\n",
       "  'mv_name': '肖申克的救赎'},\n",
       " {'_id': '1291546',\n",
       "  'value': {'count': 318.0, 'rate': 0.06597528006880007},\n",
       "  'mv_name': '霸王别姬'},\n",
       " {'_id': '1292722',\n",
       "  'value': {'count': 306.0, 'rate': 0.0696867605874007},\n",
       "  'mv_name': '泰坦尼克号'},\n",
       " {'_id': '1295124',\n",
       "  'value': {'count': 326.0, 'rate': 0.0720808131916883},\n",
       "  'mv_name': '辛德勒的名单'}]"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 准备电影ID到电影名的映射字典\n",
    "dict_rv_name = {}\n",
    "for mv in ct_mv_info.find():\n",
    "    dict_rv_name[mv['mv_id']] = mv['mv_name'] \n",
    "# 条件查询, 根据电影 ID 获取到对应的电影名\n",
    "for i in range(len(temp1)):\n",
    "    id = temp1[i]['_id']\n",
    "    name = ''\n",
    "    for x in ct_mv_info.find({'_id': id}):\n",
    "        name = x['mv_name']\n",
    "    temp1[i]['mv_name'] = name\n",
    "temp1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYwAAAFDCAYAAAAphzkrAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAABbi0lEQVR4nO2dd5hU1fnHP1+WJlVUBDv2GrGgQhTF3mPvXWPBHltMxIxj7C2x915ir7FrxBLBiEZi+ylGUbGghiao1Pf3x3uGHYbZ3dndmS3wfp5nn52599x7zr1z73nrOUdmRhAEQRDURZvmbkAQBEHQOgiBEQRBEJRECIwgCIKgJEJgBEEQBCURAiMIgiAoiRAYQRAEQUmEwGgFKKsqZdVRWamEsh2UVbu8771KOKZPI5vY4lBWSxfeL2W1jLL6SwPPt76y2iDve0dlVVWkXJv0G8S7VWaUVa+4r81L2+ZuQFASA4BXgVnKqraBM8KVgOOAq5RVW+AVZbWHZew/swv5SzcQ+C8wC/instrRMvZ2xa6gTCirN4FFgR/zNvcAhlvGdktlugP/BK4ELsorNwM4Wln9xzJ2q7JaDPgamFhQTRdgF8vYE3nbNgbOUFa/tox9CIwGeilbowxfDxiR2rMS8IllbFbBtbQFXgeOKuXeK6ttgXctY2OU1VrA4ZaxY0o47v+AEy1jz+RtWxz4CmhvGZtewjlWAI4FbrSMvV9H2VHAxZaxGwq2zwS2z29HDcf/FXjUMjY0b1tb/De9ELixhuN2A7pYxm4vsu8PwEaWse2L7FsJ+Ii5n4NiVAG/WMZ6llB2niMERuvgdaCdZWxGXQWT1qtkZXQDssDyyupbYCdgC2ANYGlgL8vYU8rqCGAtoMULDGA6cIJl7NHcBmV1FC5Uc1wJvAxckn+gZewrZXUHsGTa9EvavmB+OWU1AphacOwlymp54C5gXWDZ1JaZwDHA7sCmuMBuD0zLO/wmYJKy2scyli/odsQFy87Uce+VVVfgbuBs4K/Al8D+yuoxy9hztR0LjAOmpk63KrXtF2BWTlik56adZeyXGs6xKHAC8CRQq8BI5y52nmnMeV/mQlmtht/Pu9P3TsBUy9gMZfVnoGfaLqAjMM0yNjOv3ruV1RTL2INF2vRzDdXmBOYiuXdMWR0AfG0Ze7GgfRsB99Z2DfMyITBaAUkznVVnQS87E0BZrQFcTfXLcBKuUf4D+D9gM2AdZXUB/hK3UVa/T9u3BG4tcvrbLWMHN/xKqlFW++Ia8qYF25fBNcgBuIZ+kGXsi4LDb0haaI6uwN/T8acAfYAt8jV6ZbUALgROtoxNSpun5+1fEhhvGZuSNhW73yemcwPsD7xmGfswZ2VYxkxZLQ38CTgTGJPKbgf8DbfktrWMfZW2/wH4PXCMsrqvDs39eGACcE2q63/K6hLgRmX1q7xryl1PO2ADXNmYBhiwG3Ab1Z12G2U1IX2uAm4ATq6h/tz9aKusuhTsawNUWcbGp+8zKX7/5hAY6RndxTL257wyZwJXASOVVWfgHdySs1TPTGU1BLemO+L39sV0T55UVifgQuMLy9i/8s5bm2U+h4WVru8S4OHcuQuYWWTbfEEIjFZC0jAn1VkQeljGJljG3lNWt+Ca2sp4Z9cJf3Fmpr9rgB8sY9clLf1b3EVzD/BoOt94YG/gWerQDutxLXsCtwDDC7a3xTXYMcDauOb9sLJaP6/zbwuMBD7PO3QVoIuy2jsds4NlbJqyWsgyNi6VeRW/DzNTh3AOc1ogxwKDgP5F2ruAZexny9hU3HUBbk1sCBxcUHwX3HKY7SqyjE1O7pIz8U4/ZxW1By7F3Vv3K6v+BRZIrv4VgTOAgy1j+b/BRcBewH3KaqeCfe1w5aBDXjvuA+5L51wQ/+0XLKyvDp6qYftbQL86jp3daSurnYDbgQnK6irL2HhltRV+/9cAjgZOAX6V9u2B378Ni92j2RVk7Prk+vp3yVfk
rsp8/gwsCOyjrPYBFsDfjZULr2N+IwRG6yHXGfwa+LDI/hWBfzFnp/5rYDFcg/oKWAHXJGekbVNIPn3gIFz7NvI0waQ9T7GMTSjHRSirTah2qxR2zlvjL+U2lrExwCXK6tB0Ha+lMpuka1gVd/VsjAu/GZaxX5TVQ5ax6cpqB+AmZbWyZWyiZaxfqn9x4GPgzoK6pwKjirS3C/BjXqxi2+SDvwKPD51WcMjBwLWWsZ/S8T2p1mAzlrFZympl4Fxgq2QR3q+sNgWeVVbb5FsLyVK4HXjRMnZ/fkWWsanKahdgKPB8cnl9nXb/DPycrB6Am5XVCMvYXoXXWE82zY8tpDa2wQVUPospq1Xw36odbtUCLKms7gZ2BS4DzrWM/ZSsifvw2NTzwK/wuMt4eeLGxcARlrEfk2Igy9jf8tqwFtUd/xNA+6Rk5d6H9kBVnnXUFnfJFVpmOwGDccVgk1TuLeCw9HzV41bNe4TAaD3kOp0fi3XeyurH/HLJJ30esDz+su2Kd1LCO9ifge64q6oz/lKXpJUl//EJuJukG26NnGQZm6SsDsZfuNG4ABiDC6Jn0+GjgHWAPZlbYKwDvJ+ERY5huGvlteQ+mQb8hLsnuuHWBkBnZTXeMrZS6qhuAva2jBUGMq8HnrCM/Td1KDnaAGPzLzP9n4IL3Wm4e2QqgGVsuLL6ELducvdleWAZ4Nq884zC7zPAKqnOx3EXVee8csfilt0IZXWEZWxo6ohvweMlpyqrZZlbu50OHJCu9z1l9TvL2O1JUOSXPSydczr+2xvegU5I++8uJYBejGT9TS3Y/Gcgg/cx7YC+afvtuLBeyTL2Zd45piir/ni85VLgC8vYzcpqEHBdOtfw5LLsClytrCbnJSbkhEdv/P5PxBWT3HvTDhde36bvSm05uqDdv8XdjWviSspM4HzL2Av1uCXzLCEwWg+FGlxt5aYDq+OdSC/8oX+U6mDlTNw1MgTX3pcAJgOvK6uNazP5E4NxF8m+ePD1RjwY/Ju0f33c97smcCDwSNL0v8xpwDVoat3xzK18JpCC1JaxBZXVQXgHXsjHlrGHldX6wAPAoUU04QuAHYC/JBfHS3m7lwTezfvePtVppE5GWc0CpietvwpYxzI2U1mtmcr+V1n1Bqbl3FhAr2QJGB54PQd3JS0MHAG8kurbAg+mfkh1UP4w/J7uT7WFVRPLp/OOrqNcL9ydt1guXqOsXsJjHbNRVh3xGEGOnGbeJbmz8plUmAEG/NYydlfBOcFjFkXdWpaxj5TV1rjVmBMw2+BW523A5bjg/hl4DLhTWS1rGRtvGVs11XEbHiS/EM+oytV9Ip4ltXuxuvPasGOKZw3EA/1VwMrKavW6ssPmB0JgtB7a4R39O7WYxTPxju4n8zTa9VOAMBeneBAXJjPwWMXjlrHjldVVeFrqXTWduIBjgUstY88DKKvBwLtJAyade0hyv5wLHIl31NcWPVs1M5g7u+YnXGvMcSQej3gzb9uWwGrpvlwF7GcZe0meNjsF72BuwAXcJNzquQ6PQTyUztEPdzEtgAvXr6mZ/UhJAfm/RYFG/xGwSop75JgJ7Jm0/yHADGW1CK4J753anklCCsvYjcrqWVwod8A7rx+BAZaxN1OdmwBPWcY+xTO1asUyNk6eZrszHhxeAc/62qmg6Bm4QlHIE0W2LQ58U1fdiRrjcMpqM9zKegP4a7LYDsHjcH/BlZIl8fu6v7JaIy/Qnk9Dg9KLKqsncOF7M55h+DxuSb+erLEjG3jueYIQGK2EFDjtiGudN+Q6FQBltQSu9V+ac1cln/BvgNVwH/xvcI1pLO42eRdYVz6uIXeeDgUdXE0sA3yS9z33uU/6/0VO40yd4xiKWwWF/ICnmebTnTndHVV4J5Lvi++KWzSPAENTp9gZD6D/wzJ2SnKjbYnHHv4FrG4Z+xbYPQWV2wJHAWMtY+fU0c7n8QSCX9L1HQvsbhkbBLPdge2LHZj/u6X2vI4Ln+WKuRrz
MsSmKasBuFB9L6/IkkBhFlkxVlBW6+Kum6uBrLJ6GHdX3ljoy8cF9eeWsT41nTC5i17CrdNysBpu/XyNW1oPWsb+m+rqQIGwsYy9N9cZqtvWFuhoGatP277Ds8TewIPe9wDn45ZNbzwG2L2mg+cHQmC0LpbAM3va4dpojkNx32t+1k973C21PO7mWRV3K8zCO53XcC17W1y73h44TVnVmoWSGI0H2XPkPn+GC5NllVVVcte0Sdu+om6GAWcqq45WPR6gH+5+yLE53hH8oKxmpP2zrHpg4rgUJ3gEty7OBLCUDqzqFNhv8855Ct4xHE3dmWDr4VbJZjW5KFIgu2jOv7LK4rEQgP8AO5copMF/5+eSqyvHUhQRGPL03g7K6iZ8jM1yuFD9CbgDF45DcSXioGKXUWKbai2bBPfxVFtyNZ8kY1eR91wrq5z76yfcRfUuLlTqQrg1W4Xfs5JJcZ6OuODIproXTvf8Xfk4jPmWGGbfirCMfY7n+J+vrBaF2Vk8xwF/yNdQk193CC4gzsYD2u/jSsLiuM98Mdy9dD9wpmVszRKEBfjAuJOV1ZYpwHwN7t4anfb3As5NndYQPDhdzJVRyOt40PPMdG3b4IHxJ9P3triGfEHeMR2AJ5TVcanMSrjLagqehVTTYC1S+c2BZS1jdxRsP1BZHZ33fSFgETz4elh9/NmqnqLkLrxzzgV7x+YLC/m0Ir+q4Ry74/GgPxfsWpriFsbvcG39E6C/ZWxpy9ghec/IK3is6XnmHDVfDgR0V1an40rE/njnPXfBrB6WZ4ihrDopqwHK6jhl9SRucfbHredvLGOjCo7dXj7QL58q/JkegLvVSqEwPtgN7xvvxBMUCq3e+VbRnm8vvLWQOskeuFtmFp7ZMR74KQmLU3FT+qH0vQ3eiY7DhcJ0y9j/KasrcQtkN/wlbIdrYVfgHUt98gWvx3PTb8DdQY/hHVSOYbg19B88XrBLXrpnjST3zkH42Ivf4oHhc/NcD3fh7o9T5OMa2uDZMLvg6Zqn4gL1r6QU1iLV5EY750btnosP/gK/v0ule755Ov81ad+yeEe/exFhUUXt92+T9P8nYO2UKro1bom1M08DFi4QblZWy+eEb95vfBqwb17sYhXcTbIdnlVUyB8sY/m/SW5esd1xBeML/Pm4EPfPXwY8nefCEbCMap+KJsdsxTMJ1lVwpWIkbrU9lH7b6cDayir3e66H/3bXp++v4mMwXsTHexyHP0fn4/En8N+od/qNfo8P7jwpry2r48/cVpax/+Vffi3tn6MftIx9B5wkHwS6FXNms0HpCSjzHCEwWj4r4y6M6cw5evaqgnK5Djk3NcU6uIDYKwVyX7GMPausxuIumM64Bvc67p5YoFjllrG5XrTkh/9L+ivGNMvYAbVdlGXsNjzzpXD7KykQOxAYbRnLz1w6jeoA8Hm4sPs4BddXxN0PO1jGXq6l6o5AR/lI8yNS+dzgvlfwNNZbcAtl87x2vZWCrMUCqgswZ0ZRIa+lum7OE2Kv4b/DtLzA+VTgojxh0Znq7KXNLWP5mUxbAKfjGvAthRVawRQfKaD8Ah6/+aNl7OG0/dd4x3wtfk8vS4d0wu/1mrVc10a45diFFF9I8aO7Ul33FsRs7sCF2xXp+yzcVZWb2mRv4DtLqdDKaj18wOg5Vj0VzDDgj/j78A2eUpzP8cB7ReJBHanhGadaAEyvKaGkYPu4ooXmA2RWH1dl0JpQVip4YZuizoPxEcmDmrLeVHdVDR16TeXb1GCFtBhSptd39bmuWs61tM09zUpuX5d6BoibhCSkawxul6mOdsASeS7VoAZCYARBEAQlEUHvIAiCoCRCYARBMyLpZknDJBUbJJcr00vSqw3ZFgTlJARGEDQTknYFqsxsALCcpBWLlOmBZ8Z1ru+2ICg381wMY5FFFrE+ffo0dzOCoE6++OILunfvTvfu3Rk3bhyzZs1ikUUWmaPMzJke6/7kk09YeeWV67UtCErlrbfe
+sGs7lUE57m02j59+jBixIjmbkYQ1Mlhhx3G8ccfT9++fXnuued4++23Of3004uWHTRoEEOHDm3QtiCoC0mf110qXFJB0Gx06dKFn3/2geiTJ09m1qwWneEbBCEwgqC5WHfddXntNZ+1fOTIkYQrNWjphMAIgmZi55135s477+Skk07i/vvvZ/XVV2fIkBqTpYKg2Znngt79+vWziGEErYXx48fz/PPPs/HGG9O7d++6DwiCCiDpLTOra032eS/oHQStiR49erDnnns2dzOCoCTCJRUEQRCURAiMIAiCoCRCYARBEAQlETGMIGhuXi5jksYmdcYtg6DBhIURBEEQlEQIjCAIgqAkQmAEQRAEJRECIwiCICiJEBhBEARBSYTACIIgCEoiBEYQBEFQEiEwgiAIgpIIgREEQRCURAiMIAiCoCRCYARBEAQlEQIjCIIgKIkQGEEQBEFJhMAIgiAISiIERhAEQVASITCCIAiCkmgygSHpZknDJA2ppUwvSa/W97ggCIKg8jSJwJC0K1BlZgOA5SStWKRMD+B2oHN9jguCIAiahqayMAYB96fPzwEbFSkzE9gLmFTP44IgCIImoKkERmfgq/R5HNCrsICZTTKzifU9DkDSEZJGSBrx/fffl6nJQRAEQT5NJTAmAwukz13qUW9Jx5nZDWbWz8z69ezZs1ENDYIgCIrTVALjLardSX2B0RU+LgiCICgzbZuonkeBVyUtDmwL7C3pHDOrK/Op8Lj+FW1lEARBUCONtzCkpfAOvUbMbBIewB4ObGpmI2sSFmY2qJbjCmMcQRAEQRNRt8CQuiC9iLRmDSXOAb5AOre205jZeDO738y+rU8DG3pcEARBUF7qFhhmk4H/AZfVUOIwYBfgd+VrVhAEQdDSKNUl9VtgJaSt59pjNgOzJ4B25WxYEARB0LIoTWB4LOF04OzaSpWjQUEQBEHLpPYsKWkwMBWYAVQBqyKdAXxZUHI54KdKNDAIgiBoGdSVVrs3MD39AfyTmqf1OKmM7QqCIAhaGLULDLNNZn+W2gHHAddi9nPatg3QG7PbKtbCIAiCoEVQn3EYbYCLgfZ52yYAVyEtUs5GzW8cdthhDBgwgHPOOafkMuPHj2e77bajX79+HHnkkTVuC4IgKBelCwyzqYDwmEZu23B8FtnTyt2w+YWHH36YmTNnMmzYMD799FNGjRpVUpk777yT/fbbjxEjRvDjjz8yYsSIotuCIAjKRX1Hehser8jnLGAwUu+ytGg+Y+jQoey5554AbLXVVrz22msllVl44YV57733mDBhAl9++SVLLbVU0W1BEATlonaBIX2L9BnSp0if4hbGx7O/+7ZHgY7AJRVv7TzIlClTWGKJJQBYaKGFGDt2bEllNtpoIz7//HOuuOIKVl11VRZaaKGi24IgCMpFXVlSewHTgFl1lFsXd00F9aRLly78/LPnEEyePJlZs+a+1cXKZLNZrrvuOrp168Zll13GrbfeyvDhw+fadsQRRzTp9QRBMO9Su4Vh9jJmwzB7o46/azD7pInaPE+x7rrrznZDjRw5kj59+pRUZvz48bz77rvMnDmTN954A0lFtwVBEJSLuqc3l6qAK4HjMZtR8RbNZ+y8884MHDiQr7/+mqeffpp7772XIUOGzJExVVhm+PDhrLDCChxyyCF8/vnnDBgwgH322Ye+ffvOtS0IgqBcyKyEGT2kGZi1zfveBrNZBWXm3tYM9OvXz1pbdtD48eN5/vnn2Xjjjendu3juQCllglbKy2V8XjfpV75zBfMNkt4yszofnlIXUCqUKtOQfs7b3gYPfDfVgkzzFD169JidBdWYMkEQBJWkoQIDYIf0X8A/gE3L0qIgCIKgRdJQi2AWZi/P/ibZHN+DIAiCeY7GL9EaBEGrpFxT0gCMHTuWgQMHVrzNQfNSs4XhS7Lugc9U2wbpTNz9FFSKCH4GTUT+dDOHHnooo0aNYsUVV6yzzNNPP81+++3Hfvvtx7777suI
ESNYfvnlOeigg5gyZUozXU3QVNRmYSwLHAQcgq9/cWj6fEgTtCsIggpSzilpqqqquO++++jWrVuTXkPQ9NQsMMwew2xpzJad6y8IglZNOaek6datG927d2/S9gfNQ8QwgmA+pLFT0vzpT39ilVVW4dZbb23SdgfNS0MFRhukA5AORDoQENIB5WxYEASVo5xT0gTzDw1Nq50AnEP1pIRf4LPV3lmGNgVBUGHKOSVNMP9Q2tQgrYjWODXIbCJLav6kmX73mJImyFHuqUGCIJjHiClpgvpSmsCQ2mE2vcJtCeZDDjvsMD744AO23357hgwZUlKZa6+9lvvuuw+ACRMmsMEGG7DWWmvNte36669vsusIgvmBUoPeU5FmIP2I9B3S50j/h/QO0jCkJ5E2rGhLg3mOhq5nPnjwYIYOHcrQoUMZOHAghx9+eNFtQRCUl1JdUp8Dg4D2Rf46AAcC5wGblL+JwbxKsYFhhaONayvz1VdfMXbsWPr1q3a9FtsWBEF5KFVgTMfs89nfpH2AAZgdn763Be6p7QSSbgZWA540s6KT1xSWkdQDuBtYFHjLzI4sdlzQOikcGPb222/Xq8zVV1/N4MGD5yhfbFtQOxf8+4eynev0tRcp27mClkf9xmFIHfFOfSwwCGnrtOczoEYfgKRdgSozGwAsJ2nFEsscANydovddJYXaOA/R0MFjALNmzeKll15i0KBBs8sW2xYEQfmoWWBICyC9mQbm5bgL6AS8gXfmNyH1wuwjzB6qpZ5BwP3p83PARiWW+R+whqQFgaXwOa2CeYSGDh4DePXVV9lggw3mGDhWbFsQBOWjNpeUgBuBXYHFkG4ErsJsaNo/EulB4D68s6+NzsBX6fM4YJ0Sy/wN2B44HvgwbZ+7odIRwBEASy+9dB1NCVoKDR08BvDss8+y8cYbz3G+YtuCICgfpa7p3QM4BjgO+D1mt6Xt3YEt6rAukHQ58DczG55cT6uY2Xl1lQFWAE40s0mSTgImm9kNtdUVA/cSrWTgXgweo9l/94hhBKUO3Cs1hjERWBNYF3gh1dAFuKAuYZF4i2o3VF9gdIllegC/klQFbEDxpWKDVkxuYFhtgqCUMkEQVJ7SBIbZLGAXzMZgNiZtmwysg7R2CWd4FDhA0mXAnsD7kgozpQrLPAmcD9yAC6yFcBdVEARBq6chKx5ee+21DBo0iEGDBrHWWmtx5JFHMnHiRLbddlu22mordtllF6ZNm1axNtcuMKRZSD8j/QRUIf00x5/HGXarqxIzm4THOYYDm5rZSDMbUkeZiWb2LzNb3cy6mNmW5kIqCIKgVVPOQat33303J510Es899xy9e/fmmWeeqVi76xqHsSrwCx4AH4WPkch3C62Nz1JbfE6HPMxsPNVZUA0uEwRB0Nop56DV/EGq33//PYsuumjF2l27wDD7CACpDS4ovsFs6uz90lfAvUj9MRtesVYG8z7zYcA/mH+pxKDVYcOGMX78ePr371+xdpcewzBrP4ew8O0zgFVCWARB0NooVwyh1HPlU+5Bq+PGjeO4447jlltuKan+htKwFfekNkhbAGA2uoztCYIgqDjljCGUcq5Cyjloddq0aeyxxx6cf/75LLPMMg28I6VRV9C7XQ172gKPl701QbPQEE0rx9FHH80TTzwBwGeffcb222/PwIEDOfnkkyva5iBoDMXiA/Upkx9DKOVchey8887ceeednHTSSdx///2svvrqc03vX1hm++23B+YeoHrzzTfz9ttvc+655zJo0KDZ0/xXgtqmBmkPTExThDyBtG3e3unpL2jlNFTTAtd0vv32W3bccUcAfv/733PmmWfy6quvMmbMGIYOHdqUlxIEJVMYHxg7dmy9yuTHEEo5VyHdunVj6NCh9O/fn5deeom+ffvOpYwVlunevTsA5513HrvuuuvscoMHD2b8+PGzLZ+99tqrPreiXtQsMMymAdPwKTvGAA8jPYW0Aj48fGbFWhU0GQ3VtKZPn87h
hx9Onz59eOyxxwD4+OOPWWcdn/Vl0UUXZeLEiU10FUFQP8oZQyjlXMVojYNW64phzMLsB8wG49N0TAHeRFoET7UNWjkN1bTuuOMOVlttNU477TT+9a9/ceWVV7L77ruTzWZ54okneOaZZ9h8882b9FoaQrnccTUFQ4OWSTljCKWca16h9DW9zb4C9kBaCrMfkGKajnmAhmpa//73vzniiCPo3bs3+++/P2eccQYPP/wwr732GhdffDEHHXQQXbp0adJrqS/5rrZDDz2UUaNGzZULX1OZQnfc4MGDZ7sojjvuOA466KAmv56gdMo58WVN5eZF6hIYXZDmDm67ZO2ct68K6IhZy1cpgznIaUf9+/dn5MiRrLzyyiWV6dSpE59++ikAI0aMmJ2dsdZaa/HFF1/wt7+1/FlcGjp4qk+fPhx++OFst912PPbYY+y0006zy8eKf6XTkPXccxx99NFsu+22swV2TdtqIhcfeP755znttNPo3bs3ffv2rbVMfgyhlHIl0crGH9UlMGYAL9awb4u8fVXAAuVqVNB0NFTTatOmDYceeij33nsv06dP58EHHwTg4osv5qSTTqJTp07NdUkl09DBU/nuuCuvvJIvvviC4447DogV/0qlnNYdzJ2AUQq5+EBjy9SnXGunrhjGL5hdPvvPJ/97MX2emrfvMszOrXxzg3LT0GyNrl278sADD/DKK68wbNiw2Z1qNpvlgAMOaI5LqTflcse99NJLQKz4Vx/KmWxRbFtQGUofuCcNBj4Gbk9reAfzCK0xW6McNDTwucIKKxR1x8WKf6VTzmSLYtuCylBXx680eO8pfC2M04GbMZsRQe+gtVNud1ys+Fc65Uy2WHzxxefalnMRBuWlZoHhA/faYjYd6QbgVcy+zS9R6cYFQSVpTODzgQcemOt8hcHQoGbKmWyxzDLLFLX4SiFWG6wfNQsMH7jXNX2e8+1wYdKxgu0KgiahnIHPoHTKad1169atqMUXlJ+GxSLMpiEtUea2BM1IaFpBU1Ju667YtqD8NDx4bTaujO0IgmA+I6y71kfDpjefxyjX9BAAY8eOZeDAgRVraxAEQXMx36fHlnMA0fjx4znooIOYMmVKc1xK0AjCJRcEdVOahSEtjfQ7pNvSjLX3I12CtFlavrXVUs4BRFVVVdx3331069at6S4gCIKgiajdwvBZaS8E1gTuBW4CvgO6AMsDBwOXIZ2MWU1TiLRoKjE9RBAE9aCVzac0P1PbAkrrAq8CL2O2HmaXYvYaZh9j9jZmD2B2ILAncBbSH5qozWWl3NNDBEEQzKvU5k5aGNgTsztqPYPZx8DmwPgytqvJKPf0EEEQBPMqtQ3ce272Z6kKOAyzG2ooOw24rsxtaxLKPT1EEATBvEp9sqT+BBQXGK2Ycg8gAmIt6yAI5knqCnrfDUxN37oj3VJDyVnAO5hdVca2NRkxgCgIgqBu6rIwvsJjE7OA6cBHNZRrB5yL9CpmI8vYviAIgqCFULvAMDtt9mfpKMwunGO/JMwsfd4CWA4IgREEQTAPUp9Bd9XrX0jbIr0BHJO3f0fMHqnpYEk3SxomqfjivbWUkXSNpNLXXgyCIAjKTmlBb19CbIH0+VZgY3xA392zy5j9WMvhuwJVZjZA0i2SVjSzUaWUkTQQ6G1mTxQ9eRmJ6SGCIAhqpm4Lw1NquwG5xXLPAlbG7AbMpiDtj3RYHWcZBNyfPj8HbFRKGflqfzcCoyXtVGdbgyAIgopRikuqDzAOGIR0G7AJ0A1pYaRH8HTbL+o4R2c8gE46V68SyxwIfABcBKwvqejcG5KOkDRC0ojvv/++hEsKgiAI6kupMYxHgB1xzX874BPgYzyDag3Mnq/j+MnkXFo+D1WxeouVWRu4wXxp2LuATYud3MxuMLN+ZtavZ8+eJV5SEARBUB9KFRizMBuF2T2422g6Phnh9sD+JRz/FtVuqL7A6BLLfIJnXgH0Az4vsb1BEARBmSkl6C1ygkUaDPwe
2Byz95D+CtyP1BezE2o5x6PAq5IWB7YF9pZ0jpkNqaVMf3z8xy2S9sbHeuxej2sLgiAIykipAmNS+twL2Awzn3XPbBTSpsDLSH/G7MxiJzCzSZIGAVsCFyUX08g6ykxMu/aozwUFQRAElaFugeHpr4emz2cV2T8BaRs8k6qW09h4qrOgGlwmCIIgaB7Ks0Sr2TfAN2U5VxAEQdAiKd/yqtJFSEuW7XxBEARBi6J2gSH9MQW6c9/PQVqhSLlFgN8BvcvcviAIgqCFUJeFsQKwaN73PwLvIT2GtEbe9t2BzzEr4+K8QRAEQUuiLoHxNpA/Em4msBLwKfA60uVICwDHAn+pTBODIAiClkBdAuO/zCkwDLMvMPsdsB6wIfAZvsjS9ZVpYhAEQdASqEtgfAkUn3bV7CPgeVyg/B9mM8rbtCAIgqAlUZfA+B5YeK6t0gCk13ELYy1gI3wa8iAIgmAepS6BMQ5YMO97FdIw4GF8gN0gzN4FzscD4kEQBME8Su0Cw2w6PodTPhcBS2P2V8xmpW1/AzZBWpQgCIJgnqSUkd6LJPcT+LxSg4GdkT4BRgAvYTYR6TV8Hqi7azhPEARB0IopRWDsgafTCrc2ugNLA2vg6bTtke4A/oTZ8Eo1NAiCIGheSpl88PEa9/la39sApwI7IK2E2cyytS4IgiBoMTRu8kEzA54GnkbqE8IiCIJg3qXmoLfUF2mxepxr5cY3JwiCIGip1JYltTjwItLGtZ5B6oH0ALBLORsWBEEQtCxqdkmZPY30GXAN0kTgTuANYCzQBVge2AnYC7gYs5sq39wgCIKguag9hmH2f8BmaRnW3YAT8dlrJwNfAM8C/fGV8oIgCIJ5mNKC3mYvAS9VtilBEARBS6Z8K+4FQRAE8zQhMIIgCIKSqFtgSOuWdKZiS7cGQRAE8wylWBhPzPFN+g1S1yLlrkU6rCytCoIgCFocpQiMqQXf7wM+Q7pv9rre0kr4uhgPl7V1QRAEQYuhlCwpK/j+LT7x4P7Ac0jXAtvikw9Gem0QBME8Su0CQ/oan97809wWfF3vKcD1SMOBYfgSrddWtKVBEARBs1KXS2ogblEMzPsDaSGk3wEvAFfiU5z3r2A7gyAIgmamrpHe/0WahdlXQG468yWBD4HngC0wG4n0EnA1UFpGVRAEQdDqKCXo3S3vc1vgRcx6YXYAsCBSR8yeAbqkKUSCIAiCeZBSBMZGAEhnAA9jtm3evhOArdPnx4Ea3VKSbpY0TNKQ+paR1EvSv0toaxAEQVAh6gp6nw9MRwIXHEshnZ1XohuwFfAYcCFmPxQ/jXYFqsxsgKRbJK1oZqPqUeYSYIEGXF8QBEFQJupKq50KjAdmAA8V2f8kcDRATcIiMQi4P31+Dhc+o0opI2kzYAoefC+KpCOAIwCWXnrpWpoRBEEQNJTaXVJmZ+Gd+Cx8SvMfC/7G4Wm3a9dRT2fgq/R5HNCrlDKS2gNnAqfX3ky7wcz6mVm/nj171tGUIAiCoCGUMnCvA7As8BNzD+IDeBfYEagtxjCZapdSF4oLqmJlTgeuMbMJnqAVBEEQNBd1Cwyz0cBpNe6XrsXsuzrO8hbuYhoO9AU+KrHMUcBmko4B1pJ0k5n9ts42B0EQBGWntAWUckjLAj/OEa+oW1gAPAq8KmlxfBqRvSWdY2ZDainT38zuqa5aQ0NYBEEQNB/1XQ/jVuDB+lZiZpPwoPZwYFMzG1kgLIqVmViwf1B96w2CIAjKR+kWhnQSsB6wQUMqMp+Y8P7GlgmCIAiah9IEhnQUcB6wH/Ay0ufAl8CY9PcB8CRmMyrUziAIgqCZqd0lJfVAugO4ENgTjzP0AM4BnsHTX1cArgIuqGRDgyAIgualZgtDWhn4F/B3YA3MvkSqwlNrH8VsVl7ZQcB1wCkVbGsQBEHQjNTmkvoYGIDZByWcZzRwVjkaFARBELRMahYYZobHJgoRcADSSMzeSWVH40IjCIIgmEepK4ZxK9LuRfYcBvwD
6VOkgyvRsCAIgqBlUbPAkNoAnwDXIL2B1A+oSns3AxYGTgTOTOt6B0EQBPMwNQsMs1mYnQssA7wCvIpPBPgs0AYzw+xxYGNgO6STm6C9QRAEQTNR90hvs58xOxXYAndF/YBPd57b/xWwP5BBWqQyzQyCIAiam9KnBjH7Jz454JrAYgX7XgWOqGNNjCAIgqAVU7/JB80+RVp7jjEY1fvuLVejgiAIgpZH/SYflBYEYoWiIAiC+ZD6zlZ7KXDFHFukBZCeQ9q4bK0KgiAIWhylCwzpFGAr4OqCsRnd8fW5Y5bZIAiCeZi6BYbUBulC4BBgQ2AV4FKkdgCYfQsciy+rGgRBEMyj1DZwryvSZsA/cStifcy+wOwGfJGjI2aX9WlEplW2qUEQBEFzUpuFsRfwAvA5cCJmU/L2nQAch5SfZVVFEARBMM9S2+SDNyG9DZwEfIB0NL7+xbH4wL3OwBNI3wCLExZGEATBPE3t4zDM3gb2R9oQuA14B58iZCrwRn5J4A8VaWEQBEHQIiht4J7ZP5HWAZ7DF1Py+IV0DDAGs8cq1sIgCIKgRVBKltTiSAMx+xH4DfB/eXs/A25BWq1C7QuCIAhaCLVbGFInfInWV9PffsC0FM/I8R7wJFJ/zMZWqqFBEARB81KXS+pu4D3MTkjfLwbuxFfdy/EpsCgwEHiw7C0MgiAIWgR1CYzbgGfyvrfB7NDKNScIgiBoqdQewzB7DLOpQG4FvuOR2iGdVfmmBUEQBC2J2kZ6d0L6NH2+GJ908BHMpgOnNEnrgiAIghZDbRbGNKBH+jwcWA4YhbQ/MUgvCIJgvqO2kd4zkGakzw8BDyH1Ado3ScuCIAiCFkVdQe+uSPcU2d6pyPb2mO1epCwAkm4GVgOeNLNzSikjqTtwLz5P1RRgLzML6yYIgqAZqEtgzGDOKUBy7FywvQpYoKaTSNoVqDKzAZJukbSimY2qqwywJXCZmT0v6VpgG+DxOq8qCIIgKDt1CYyfMbt89jepP/A9kJlje90MonqBpeeAjfBFl2otY2bX5O3vCXxXjzqDIAiCMlJbllRbclOWS1sivQk8AaxO/acy7wx8lT6PA3rVp4ykAUAPMxtevKk6QtIISSO+//77ejYtCIIgKIXasqTaAq+kzx2B6/FpzJ/EYxjtkPYosZ7JVLusutRQb9EykhYCrgRqHDBoZjeYWT8z69ezZ88SmxQEQRDUh5oFhtkvmO2MtChmT2B2E2bTMZuJWxndgCFIJ5ZQz1u4GwqgLzC6lDKS2gMPAH8ws89LuaAgCIKgMtQ1+eDRwAVIa2I2GukKfAzGrFTiHeBipG8wu6+WMz0KvCppcWBbYG9J55jZkFrK9AcOA9YBzpB0BnCt1V5PEARBUCFqi2HsA/wJ2Aaz0WnrMcCPeIrrFHziwbuAQ2qrxMwm4UHt4cCmZjayQFgUKzPRzK41sx5mNij9hbAIgiBoJmqzMB4GXsPsSyA3l5Rhlp2rpKS5thVgZuOpzoJqcJkgCIKgeahtpPdU4Mu8LVXA+TWUtbK2KgiCIGhxlLZEK5AmHTyzck0JgiAIWjJ1L9EaBEEQBITACIIgCEokBEYQBEFQEiEwgiAIgpIIgREEQRCURAiMIAiCoCRCYARBEAQlEQIjCIIgKIkQGEEQBEFJhMAIgiAISiIERhAEQVASITCCIAiCkgiBEQRBEJRECIwgCIKgJEJgBEEQBCURAiMIgiAoiRAYQRAEQUmEwAiCIAhKIgRGEARBUBIhMIIgCIKSCIERBEEQlEQIjCAIgqAkQmAEQRAEJRECIwiCICiJEBhBEARBSYTACIIgCEoiBEYQBEFQEk0mMCTdLGmYpCH1KVPKcUEQBEHlaRKBIWlXoMrMBgDLSVqxlDKlHBcEQRA0DTKzylciXQE8Y2ZPSdobWMDMbq2rDLB2XcelY48AjkhfVwY+quDlLAL8UMHzR/0ts+6oP377ebn+ZcysZ12F2lawAfl0Br5Kn8cB65RYppTj
MLMbgBvK1djakDTCzPo1RV1Rf8upO+qP335+rj9HU8UwJuMWA0CXGuotVqaU44IgCIImoKk64LeAjdLnvsDoEsuUclwQBEHQBDSVS+pR4FVJiwPbAntLOsfMhtRSpj9gRbY1N03i+or6W1zdUX/89vNz/UATBb0BJPUAtgReMbNvSy1TynFBEARB5WkygREEQRC0biKIHARBEJRECIx5GElq7jYEQXMgqbOkBeouGdSHEBgVRNJ6zfXQSqqyZvY3Sornq55I6ljGc1Wl/y1ScZBUkaQbSTsDVwNLVOL89SV3nS31d6gP8UJXAEkdJZ0HvAhs38R1t5G0PNBZ0uqSHpO0RlO3AcDMZqXvXZuy/sZQqU6shHo7SVoP2DAlejT2fAcDDwE0t+JQDElbAr/KCbUynXMBSRcBpwJPmtkn5Tx/A9u0OfCmpAFmZs31fKW2LCSpc/rcoL4/BEaZkdQfuBvoClwI/EbSKk1Vf+qkB6Y2XAT8FRgraT9J3Stdv6SlgeMlLZG+Xwjs0ZKtDUkrpM4aM5uRtjWZNihpVXyc0fLAisDPjThXJ0mXA1sA3dO0OS1Gu5W0XGrfBkBH4NflsMIlLQncBXyLv3d909xzvdL+Jn/+JGWAQ4GbgfMktc09X83Eb4GhkvYF1m6I8GqxL3FrRNIZwJnA28CDwKvAKsCJaX/XCprhHSXtJGkZYAdgdeBPwFTgEWCymU2sRN15bVgIH53fDhgs6RlAwK05a6OlIakDcDhwk6QTJZ0u6Y/AupXWTiV1kHQ0cCzQD/gNsCH+mzXkfBsATwAfmtn+wDbAPpLWTNptswqNZDnthc+L1AM4DxgMLNaIc3aStCBQhU9UepmZPY7PQ/coafyCmc1qquuXtIakh/BxZAcCH+DTHPWQtKak7ZrK8pGzXfo6AXgXWAY4BuhT3/OFwCgTktbGJz78G/ASrikegz8sD0jaE7iAyvlVq4BTgAOA44HTgftxDWdr4D1Jy1aobiQtDOwOTMevcR9guJmdljqrpVO5Ztd0c22QJDObCryCa4HvAhNxodcBF3aVakNPYE1gYaA3sBvQCX+Jd5HUVlK3Es7TRtLWkk4FpuHC73lJTwAbA39Of83mmpK0mKQXccv3KuDH1Lbh+LvyZQPP2wU4CljBzD4Hvpd0mqR1geVw4TtR0unQNNcvqR1wGfCimZ2Nv/Pr4Rbf1sAlwCQzm1nptuSaBJwk6S/AqribvBewKbC7pPb1OVkIjDJhZv8GjgO+xzX8XYHb8Acmp0Velx7sspI6vim4G2oxYDVgf+A/uKXze+B9oFuufLnbgAus1YChuKvhYuAzSb+WdB2QkdSuJfjTkwBbPK8tiwE9cSH/C/AkMAhoU4l7JZ+5YBCwLK5pLw20Bz7DtcDfAn8ANk6dYm3XMgv4ELcmfgJ2xJ+5O4CzgRHAfySdVe7rKAVJOwKP4W62fXEFpjdwJy4wepvZ9ORfr3O21LzzdjKzyfg9O0TS7fi8c9vhv+Of8BjOGrilX8kge04BqTKz6cBfgIMl3QU8hf8WF+Lehu3N7LVc+Uq0p4AFgdeBZ/H3/w/ATGA/4H/AmpIOK/Xeh8BoIMnvvX/+tuTyWRe3NA4BPsfN7g54x/1Nmepuk/d5pbyOry3+UB6Ma5U34VbHp0AG9+uqEp22mX2Hm+D/NLOjzOw64NepDZPM7LDUMSwnqU+56y+FvBe7A3C5pDvTb9gNeD21eUPgaWBnM5tWIQHXHn8mNsI70Rdwq2Zz4Dvg37iPvxtuNdR2LW3M7Au8gxwD7IkLiq9xgXQJkAXWk7RZ7pgKXFOxNnbCO+9cp9kbWAjYA5iC34fpks7HXUe96zhf7pr3Au5Jm7/HXXlT8ev8AHd7fYdbjTcDW0haDI+t1Wm11Yd0/w0gZzWY2dPAucDvcEXkWvz9Px+4T9Ib+eXL2JZiyk1bXJk7AVgct6L/g7sF18Ozydqb2fel1BECowFIWh9/CBZO39uk/wKuMLPd8Lmv
LgYuNbMd8B9tyzLU3SYv++gM4LL0MoA/DF/jbqH/4ZrF74BRwABciF1Wl9ZaQhsWL2xT+vgiMElSe0kX4CbwcODxVG4/XOtr8oWwCl7sqcCRwDn4i9wJ+EnSNcB4YBfg9nRcoy0MSeskVwXyBIhOZnYXLsxfwu9TB+ALXLi/jM/O3NXMphWcS+kaTB4sXyBpzn/CBXQGeB7Yycx640LpAOA0qq28isaTJC0sD/D+hD8T/8HdlD/iWYO74B37isAKuKB8FlhQHgcrSp7wfhaYnNy8bwB7A7PwpIF7cKtqZzO7x8yuwO/LHcDXZjapnNeaYiOrSHpF0hZ5u4biXoXLgH8CY/GYyl3AV5I2LGc78p9vSSvl7foBt/C64u7Av+ECe0jatr+ZXVtqPc2W4tVakbQaLgyuTMG13EOzAO4v7C7pbtxPeCTuLgC3PCZKurchWmvOMkh1rYZrjh8C+5jZj6nYOOBGPMD2e+BkM3tX0nHAA7h//vNkyjcISXvgGRY3m9l/oTp9FvgYt6pWxTWrzfA1THZMbZgB7GZmnza0/ga0dyk8GDo6fc8CD5nZf4BxknrjcYMfgHfwl+tcYGVJL5vZu2VoxhbAKck1tzvwL7zDBPgEd4k9nP6vjvu6P8I7mPxraY/75v9PnoFzLHC9mQ2RB+rPw12hN3lxnYy7Il/CBdKSuIvm32W4pqJI6gdsbWbnSvo13on/gFsCw3BX3Db487sx7gL8EO9gL8ITRMYVnLMT7t69Hn+GdgQuxTvjkWb2T0lr4hr8VPx6fyXPEFsBX1PnTDP7ijKQFKQuuPBbFQ+wn2Vm/0j7u+Ma/WT8HbgqXffRuAuyG+4+LAs5JVKeVHAtsJSkjc1sZtr+Pu4S65naujIuxE83s8/zlZA662oBLuVWQ3KlbA+sj2s2t5rZz5K2xbMhzsO11bbpby1gqpldJ2kTM3u5gfXOdiOll+B43OTeIj0QWwPdzez+VKYNbll0MLPzGnzBxduyOh4feQ/4u5lNTFrr9GLuLkm748H/+5LLp8lIltcluNY5GteqAH6bYj5IWg4Xuuem7xvhGUvLAc/gKz42SiOXp+wegQuHNYAn8n6rPfHO/DNcsPwG7wQPSfvzLYp+uKuxPW41nofHO/5iZuMlnYC7JI/DhfeNZnZ+Ok8vYGEzywmqspIE72q4dv8knim4d2rDkORG6g6shHdWz+GdezfgIDM7W1KHZP3ln3dz4CS8822Lu5uG4MJnMWBbM9srWb1L4Vb02FRHf2C0mV1dgetdB1fOvsV/g/7AlmZ2dPrNFky/SVdgfTN7MR33CC7knylTOxbDhdNXuCD9Hk82ubKgXAdcmHQHrsPdUV8Bj9TH6gqBUSJJUz0WfxHH4FrpV7jlsDb+0r5ccMxcL0Aj6u+Fd34zcL/jiriZ3yO14Y9m9nZe+cWAy/FA+z/K0Ya8c++MZ0H9J9fR5u1T6tyqcI1xOeAcM3urnG0ooY1tzWyGPOd8X2BR4HIzuzvtb29m09KLNBTvtD6WBy4b5VtOHcZfgTvM7C1Jv8U7ut9T/Zs9hHeYK+DZUr/BtdYZ+H09O51HeS7I3+HZVOencofj7sdPgLOT0L4beNzM7iu8F425pjqud3Vgk3QtD+IWxFq4e+zXaVtXXLPtgnf0fXGhsTbwVn578867GB7fGWxmr8h9/xcCb+JC40L8nn5sZpemY87DM5RezCkyFbjePfE44ct4MH8dPH63InCemT1Vy7FljyFK+gduqe2L90ubm9mlyeuxvJm9J08+2Ay3tCbLM8mm1deCDpdUDSgvVpDoCbxqZn9P+3+Pa85vm9nOadscD0O5hEVifWComd2c6pqEd9odzGzrwsJm9k1yvzRYo5Sn4W6Ij5odn3fuRyXtChwlaRqu8d4DTAK6SvoSt7ja4i6oJhuDkeskkrDohr8ka+Iv1JQ8114uNjAL+Afuxmt0IDJPYD4AnCHph1RHDzyt9HtgK1yzm4G7l2YCM8zs4Zwg
S20xwCQtigv/bsBVZvZk6kz/iAudXfM6xtOBOyR9YWbD0nNc6cFiSu1YCLduJuKusEXxmFAXM/sE+EQ+wntD3PXUGY+7vT/7RHnvUHqG78GD1b/GY2ET8NjhOFx4XgFcIektMxuKd9xLp9OV7brzFJDdcMVxJzyp4HzgFjO7T9KteJbWU/Kso0mFfUAFhEUH3OX3Cm7574rfrw2ozlT8k5k9gY/RyWVzNUiBi6B3EdJDO0sevB0MYGZv5wmLo3At6nXgBZU58yLVsYOknFtiITN7wsxulnSApPPM7GPc0viX8sZXSDpB0gqpze839AGVtBP+Uo/F3QG57bln5mr8AXwGf0HPxdMXs7gm+YCZndBUwkI+2dxZuGZLctO9ggccB+Kd2MHAbZKuknR80tjXAW4yT4tubBsKtce1gI5mdhTuVnkfT/3sg88AsIg8PXSUmT2cjimmER+Au8kOAlZIL/w3uAvonvxjzOxL4BY8yYEmuv8/4x3lQri19gPuZtsH78R+nVf2y9ReM7Ovzey93D3LE7ad5LEykgX7Kv4MrowLia3w1NAV8CSBv+JuIfAkk1vTsY3unCUtKekOkjvTzB4ys43xZ+qF1J6PJL2DW48np0NPwwVm2dGcWW498ASXJ/GElza4InEqcAYwEh/Xk0vMadMYpSgERh6qngPJkkbzAP5S5kZMLpQ0nvXxWMa5uHm9lfKyFMrQjio8Y2ZLeTDzMlVPL3Ivnia4rZk9m8rtJmml9GBviPtVG1r3QvKBTsvj+fxtcH9nodW1IPCzmb1rZkea2b54FthlwH+tEYH1BrR5U9zt0dbM3kybp+CxilvNx748jFuEp+H+7464K09m9lkj6++bOn6TD1JbAn9GBgHLS3o3ff4iWYjf4YI1A5wrH50PzH72lpG0bBJqCyV3y/14fv8YfHzI9Xgw9Xwzuyq142T5tBsPmdlljbmmWq61XZHPq6R23YBbcxvj9/hi3II4Tz6C/g94J3q15fnNC967jfDfaqW8jvE6XGB0wzOtHsNjFJ+Z2c94XOjL9HzOETBv5LXui79vbwPLJCUqx3t4cP1C3L3499Sm5eXZgDkLpFxtUVIGuwDtJG0kSXja9UfpGf4Wd/09Z2a744rJPvhzvgOUQYEws/jzfr5j3ufT8Vz8LfK2LYC7pbYvOG4HPD1zkzK0YSGq40oL4+mRo4Cl07Z26f/GeGAr16778If3kEbWPwjv8PfAc7ZvTud+Dh9Nm9+GlYF7C+9dM/xul+JxpbXrKDcivTyN/p2KnPt0XHnYEU8j3gJPo+6Vfr9BwM64pbA4cCWwXTp2mYJztcMzgj7AFZYuuGuvXfo9DsK1+OtwLXI3vMO+B7cselX4fp+La/PL4AJsNVz47oFr+o/hwvsWvNM8MX1eDziyyPna5H3Opvu3cZH3YlN8sNnruPWyAZ6ZBNCjzNfYKd3fe9N7uCEuNHYoKLd5avON6T5sjafNPgUsV8b25PqEc/AxNhfiWXSbp+1VVI/ivge3Rq9JfwsD3crWlko+XK3pLz3YN+L52lfhfldwiX0TcFdB+bZ5+3dv7EOLC56z8XjA9umh2x/3ke6Te3DwFFFwTfnu1Na7SEKlgXW3T9c/BM9iWRwfkXt22r81HsDNP+YoPKDeXL/XGqljeg4PqoMP/OpQUK5dur7HcIHxON7ZtW1k/QvhmTDgAvwJPMh/Om6Z9Uh1X5raunLqDI9O5doXnK9D+v9bXADOSs/dA3iaKngmznbAVun7Wrii8C5wcIXvd6f0fzPc2rkBHw/xh7xncy28kz0c2D1tv44kHGs59yr4fGdvASelbQvgmX65enPv4414B74T3oG2I3WoZbrOtXHXzu54R/zHdJ0D0v6B+e9N+n8ecEr63KfM930DklBN9/XvuGKyZXqOuqZ2/gZ3ue6Lu/0OzjtH2e5PBL2ruRu3Kh6x6hTLbfGUtdeBLpJ2MbNHkq91Rvr/I95xNYgUnDIz+7t84E/OJ30o
btEcBqyWgpj/TGYouMazJ57a26C0wWTyt8G11km4NXE8rsHNBH6RNATP2+8j6TBLQXfctfDw3GetPPJ5k3bAs8YmAkOSy+46YLykMcAYM7vQPHNoIDDRzP4m6XFLKbWNqL8jHpP4mzz5IJcl82cz+zHFuA40s8slXYtr2O1xwT4MT5SYlne+VYBD5YHs9/DA5UX47/G1mb0PYGbD847pjAv5n4BdzIPKZScFVQ/DB3z9hKeuLgCcaGZfSzpH0k5m9pg87bw98F98HERb4DU8qaDwvL3N7Ft5BtkBeExsOLCZpHNxIfyq+eA/zDN72uEa8xfACDN7rAKXPAZ/9sGF3weWkkrkywQcLelLMxud9xv2w+MFWBrvUw6S++lwYKo8bXltPKb4jpl9KR97coCZXSPpYzwhoCPeh+SmQylrVtZ8LTDkWUTXmNlYM/te0tn4KFTS59VwjdBwrXtjSYuY2Q/Q+KCapDNxLeow+eCkAXhmx/G4NrwVnnc/Cc98eNt83MfSeCDx2PxOpJ5190zX3BEPjH2Am7Dv4y/5r3GBVYWb6M8Ce0l6P9X5OZ6BUZGpRupgIrCvmX0lT3deB9fgD8Xv30p4XCnXtj6kTquxwiKd4xdJT+GjwauonrRwT0nb47GJ0yStb2b/kg/Y+9nMHsmdoyAetD0+a+sPuPa+Az6I0IATJb1uKZCbjt0W13wfN7OLG3s9dVzrVEmT8alUqqieSXdT+SyxjwHHyVM738eVjx1x99EN5iPa50A+XmSLpIxMAPYzszFp3454yu1FZnZD3jHtcMvwOTzIXnTKlIYgXz/mWzObkt6J43BhsTvwiqQdzbOMjgXeNLPRknaheubXCbjLs2zIExsmS3obd1F+jPcDH+AJE9vgHoHfS3rTUuxOPrjzMjMbC+XPypovg97y7JS/4trShKQJYT5ye4KkkcAU88DRprh2Be6GaPQ9k/QrSQ/jD/1+uHA4AR/cdTne4awL/MZ8DMXXeOd9YmrnF2Z2dkOERZ6FcoKk+/EOahHzIHUHPOtkEj7GZCqusYxPVte9uMmLmd1unt3TJMJC0qpK2WBmdoNVj9pdHtfat8CzZr4zsxcszZKbyryEd2yNbUP+NCE74Rr3OrhG/X6yvl5P2/8CZOVpvvfkhEXuHOZZeL0kDcKF7/V4vKMtHrB/GH8u1sFdWrk29MbdFGdWQlio+DxTHfAR6LfjvvN/kGJ8+DM8DA++j0oW90jcjTbHuh7yxIxbcYH+Bj7I7l0zGyOf9vsNPO34YmCmfOwRkgbgHoAFzew6M/uhXM+dfCDs0/j9BzxtHLcK18Tfu+uT9T/KzC6Tj+Q+FY8XvWhme5ajPZIWlLRyakPOytkbd3cegM//NCt5FG4ws5G4y/psVScOTMsJi0ow3wkMeX70b4BxZnaomU216kVzlsFHQ+6Mr49wO25uboILl93wDrQx9e+Ha/KX4H7gkfjL0y09AA/hmssDyZ2yA+7KGAEsLU/za9D8Rkkw9k9fe+FZJ4/iQVRSm47Brapl8JcFYBP5wlCrknK5m5JU9xGklNK0LWcdD8LjTk/gsZZl8o7LvURjLG8cST3rbidpV/n4iPxO4R5coF6Dp1h2lXSAmV1iZv81H0S5OP7s5LOofGW4LfE41YW4cPgZnyKjI+6SOh//XU7EhRDpWr4FzjUfc1B2rHqQ4DmSfpXbjGfgzMLdoCumv73MbKT5XETrJNcf5uncc9xvSQfhz3ZXfBLO3+H9z5hUpA0udI7Cg7cd8efuZDywfJ+ZlU2Ll2ez3Yi/DxvhkyAeklfkd/jz1BZ3P94OPCcfUX4VPr7kP2VsTzt8PrAtJC0tn6Z9S1xB2xJPEX8DmCXp4GTxgFvWS1I99qSizFcCI7l9BuJa24Ly1NlcrvUD+ND+j/EH5X7geTMbbD74ZjKwqfnMoA2pe0lJt+EdyAFUa2Z/wWMgfZLpOxPX5LeUj9o9Fg8EnpnaMqah2kwSjLdKehm3YJ7CTfyd
k4tFeKB1CTyT507cZ2q4gPmjmT3fkLobSrIqtsDjKC8BPSUtaNWD0RbDFwx6Dk9X3TanmVrjp/TogHeM2wK3S9pNUvdkNYzEn5HNzWwLvONfXT6CFkmnAS+b2Qt551sQF8h/xzXaDHBEaufjuB/8Odw1NRwPnF+BB3bzBWAlRi+3kbSWpIHy1PEuuLWJmd2EW0G34W6aa3A3Ws4CyHX+HxU57+LytRi2wMfEvIuP/r8JD3RfKmlTM3snafaYx2Mm4fdqWWBvM3uojNdahT/bM5LS+B1uZRyqtG5LeuevwTMP78UVtlPxd+YfZra9Vc/h1pi2tEn90BK4UFgOFxwL4srQVFxA/WA+zcvzQD9Ja6fnYWl8JPzoxralJKyCmRUt6Q8Pxh2Ja27r4JrSofjI1CdJKX94MPdC8tIdyUv9a2DdVXi2y3u4C2pBPAVxR9xffTDuZngMn+8H0ohRYFkrU6YDrqW/glsQg/FsnstS27ahOlPnajzY1gv3mS7eTL9Z39S2YfiLsQOupS6BuweXwC3CXMrvQrj7cPvG3i/c1XV63vftceF+P25d9E3br8W1wEPwKSsuwLXoHnnHds77fGW6hovTM3Aoni67ZV49T+Z9vqqxz18d15nLwFk2vQcjcKVmQWBQXrkFcEG3KB7bewY4Ie1bA59Zt9j5/4i7k44mZUFRPQDvLTxGdxU+F1rumL1w5WCfMl9rFe5K7IRbFU+l7SfhMbrL8aSXXPkN8Sl/wK3FPwArlbE9PXFvxjO4MD0xPV9D8Myx3+aeQfw9/XPa//dUpj0FWYGV/ptv5pJKwaDNgVfMbIqktfCXty0uod9J5WbPu6O5pwdpTP0r4QJrDTwdcGpqzzu4S2UP4FfAGmZ2ZDpmcNr3kzXyh5LPK3MIrgWOxgeW7YNPsXA63hFcg2t13+Nugc3Ap2hoTN0NbO+BuCC9NbVtF1zgnoIL133xdMZPzXzuKjObmXzAHzf2fqU2PISn7P47b1tnPIPnH+YB4UXwJIFvcCHb1QpGjUvKjdSejAvsL3Ft9XXckrgUFxqP4VlAhls2fwSutaR5V4rkpr0g1T0Un0blcDxNub/56HEkbYw/I+NwV2Zu6uxJNd1vSZvgv9dBuJVyI25tjMfv2b9xJeobMztH0g34732KlSnzS5o9gnyZdF3/h1tPR+BxiuF4TOh/yQvwupndkDwSr+FWxshytKWgXefjMdIOuMvrE7w/OBF3U/6Ap9a/hb+PH+DTkJwlqYc10M3aqDbPLwIjn+QbPB5/YD/Ghci/CspUYpKwQXgnMBnXMnfER67+F3d/vSXpTryTmsu8L2M7TsW1p7dw19jKuLZ+NO5aWBDAqkdNNynyGWT3xzvRRfEO9Wdcs18Uz+DKmtl7Fao/txLaDbgwXQ1YyDylegW8c5uSV35f3DV1WA3nE25lTJZPefEyrkXnrL3u+G/xGd5R7Ib/PreYT8Ne7uvLn/14EO72OhWfdv1GPGPoWEmXAAuY2TGqnpH4Ytzi3K/EunrhilA73H24Ie5q+RtuzayNK1AXpu/jrEwzuRa0oycupLbFXT6v4N6G6/HxRM/LR5mfldryMm51bQG8UK73UT6NUAd8Hq3rcZdvR9yiXRBXKhcF9jCzXeRZY1vhz+G2+LiUO8rRlgbRlOZMS/jDX9CXgH7p+z64RF+tCepug2vwWdzd8wKuQffA4wfr4W6NnhVsw4FUuxZWwLWWUcCZ+Myfb6Q2dmrC32RVCkz9dK8OxoPZK+FjAT7BF6jKlSnLgCSqB0O2KdieG6G8Pu5vFx7fuhx32y1GtdK1QAn1dMU7owPxcTTP4m6uHXAt/IF0vbuTXJEVuNcq+N4b79BXxeML2+Ad2iV4B75uKncYaUR/7t2pZ737pGftf+nab0vP3PZp/+UUjHovw7V2Tv8H4oJ4Wzx+eAHu0umNW3LD8HEvT+MWx9K4Zb9LGduyBB5g3wJPn18ovYPb4enIV+AxrT/i2VDPp30r49ZF
xfunUv7mq3EYSdP7Nz5Hfy5w+Bru3thG0mhLA4UqgXkq5QO4K+Wk1JYrrNq0fFPSBlbZCeP+gWt3h+Oa3ot4ptYduKC6sIJ1z0Uy+6/EBXku/3wm3mm9gZvqB+Exp0dxSwMo2+Ryi+PrL//VzH5S9WI0bXHNfxweAP63mZmkn3BtcAaebfcU8KUVpJDWwJZ4ptEgXEgPxV0Qy+GaZSfgDGvg2JpSSNewUWrDfWY2CvhWPrbjZzwl+DlcYL9gbvX2wTuyr4GnrWHZSsPx6+2NW9SbkiYQlPSKmZ3QqAvLI73n3fHZgnOTPY7G3cEL4sJpP9zaGY6/A5/iMYrdcIF+o5VpYGBKdjgAt1gewBWCV81nfW6HZ4p9iieirIFn+n2Ax41WwJ+V0eVoS2OZL11SMEfHlEvbnGhmHzZR3QNwF8CfzOeqr8Lzq5tqTMPS+NQBZ6c4yetWAR9tHW3oiL8Mi+DuwYdx98R5eHxHVM/pk3uZZ+Ca4ENmNqxM7VgA1za7mdlxaZtwa+B3uItmCp7V8xKurV5uZsNVz/UWUl2r4x3Se7g7dHNcqx+KBzjLus5zXt3L4Z3Rr/GkjxfxaSwG55U5AbcEJuDB+D3xxIM18AGutzayDUvg93MxvEM8EpdhjR5MWaSu1fBn6htc0VgLt+TPwjX6X1MdNzwGTxs+DxdiZ1iZJjGUtDBuPa6DC4yl8ed6Ci44puNJBnvgQvRT3LI+EJ8O5Y/laEe5mG8FRnMjzy/vZmaZZmzDsfiLdEIlXtpa6l0G91v/B++Y7sQ12//gweEBeBaR8CDyM+m4vniWyP7WyPiKfPDV3mZ2ffr+OPCgFfEPJw37ITz19XFrxFToql6jYxXcRdELH1fx94aes5a68mMVh+N+8Jdx6/JQPAniDksrIcpn3P1JvgjR//DspU3wAWuflaE9XfGsxNfxAPjnuMAod6zwCHwsy1i8o/4BFxxj8SSX5fHspKPxGN4NeAbSx40VigXtOB1XPB7BLZcD8DjR5aldE/DYVTvcHfUWbtF+g1sUP1hTpcuWyHzlkmph3I5rm81CcrmMw2f8bDJhkfgaj5fMxH3GT+Oa+6q4ALvYzF4vctz3wM7lsATNl5Y9Qj4g78rUnjMkTcfnesoPcv6CZ9Z8SCMWpEr1TpL0P9wl8hZwaLm02SJ15YTFMbgGex/+m9+LX9OJ+CJY75rZP/PcsYvgy8hOxd1T5WrPj5LOr6QlnSzXcfgztR6eSnsh/vv1xOMkx6VyM/DV+WZIOtfKMCV/QWZlFR6LWB6/p8/iSS4b4MrCQnhc7mPgaPMR7zsCExro9qs4YWHMx9TXpVLmutvgWu77uJadWzvhQDwY+gmu8b1c7ow1Va+ethLuojjZfHW3vrhb7FBc4/whZ03I16Uol5viRHy6ldvLcb4a6ljEzH6Q9Bu8M9omZWhdgA8oPDSVOwMfKXyseVryLrjbb3M8NtPqOgj5pH0P4hr8F7hyNgCPIfXGp715CRcWn1ag/t64MrQv1RmId+NCelN8TIfhLrk9cMFxk5mNU4WX020sITCCJiHFBaryXwZJucVd1sAHLgkPyvfH0w5/sjLMl5RiRF3xCQCn5m3virtnRpjZK3nb98ItnRXxWVnHUEZUxvE9NZx/A2A9M7tKPgPuQ7gW2wnvSDvjgdQtcZfTEDN7Le/4Bc1sQqXa1xTIx+Msgqcv/xvPSFocFyLD8ESPKZa3kFMj6sqtffI9/uz+BU+ZfRO3Yo7ArbrV8bjQ83hA+y5cObndfIaJFk8IjKBipOD68mb2UsH2LngWzq14x3URrvmtgqdsliXQlwTC0nhHcQae6fMBMC2XEYWvQTLKzG4v1O5SoPiLlqzx5ZOynxbGM//+jlsTz+P+8ivMbJ1U7hQ88+w9fFDaL0mglyXzrCWQXK7X4QKyAy4cJ+PZSU+Wua6euKW6FC6Uh5rZOWnfVnh69ln4yOz38VjK6qlNm6X/X7SGez9fzSUVNDmnA9dIOkvS
zZJOlc9ZtTVuVdxhZqfj4xC2xTu6pZS3ZGlDScLg/FTXsFRHWzxzZo3k5pqFj4XZK7nn5hAMZvZpaxAWql4qtQcudPfCLbXF8BHMrwGvShqSrKedce33UVzT9chzK+iwSiX9bifh92H79P8HfPBeuVkCzyZ7Ds+sW0q+xO5DaftleDbUJHyw7vb4eIxFgA3M7PPWcu8j6B2UnbyU5TfxNMEX8DTGyXjg72N8YZpcCun6uEuqF55q2yg3QU4YSLoXj4msifuwN8fXk84f1T8W+Nx8FHNzrO3RKFLwdq+kUX+Ga67j8EycA4H+yfV3Cj7mYAw+fqQKn09sBJXpRJudlGCwPL6ccUUW+5Kvm7M+Pv9Xbm31NXDr5hpciHTGhfdq+FiXt/CYxcRKtKmShMAIykKBXz7X6Q7ER/X2xTvluVwBqcP7HJ+baSzesTek/vzOvg0w08xek0+5fTuenTIo57NO2VHT8MyZFSR1b40vcHInPY1bU8vgKcpfAyfj1sN5uFY7Al8rO39Kk8vKkRnUkjGz/0r6pRLnTi7P9vgkgf3la+x0SNsOwN2sr+AzO7yIx+veA96w0gZ6tjgihhGUjdT5dzaz/6Xv1+Nxg23woOPL5utE5MrnJoXrhnfojfKhyyeYnGXVk0fuhWeirIQLsSww2sxekE8i+BMuMLoDn7Q26yIf+UjtjXHt9RDgYTO7OWnAG6XtfzFfVrXZsuPmVVI22u/x7KuJ+DPXGTjP0pxnrVUpySdiGEGDkbOlpN9K2hB3gfRL+/bG54f6H65dTQPWk7Ro7thcB21mkxrqQ5ePniblr1+JL+TTXtKFeMe5G2nKcNyiPlG+nOjzuPWzuDXhyoEV5D3cwvgzHtB+JfnQp+G/y+K45ksIi/IiHyF/DJ4BdRdu0S0BXGBpJof0vLdqYQFhYQQNJM86WBc3yb/CNdknzexK+Syoy5nZLan8WngM4XNc+23s4ka5qUPOwH3DP+LzMvXAB6h1sDT4SdL+wCZmdniyghbF8+RXtAqtXNccSFoPFxYT8aSCa83smeTHHwycOg8IxhZFeg43MJ8q5ih8nMUD+JxVPfClVFu9oMgRMYyg3kha2HztgEXwVMLX098H+LKv4JlJsyfRM7N35Kvn9cfX5Hi3MW1IwmoNPPvpQ3yuqUPw7JOrbc41PEYCv0lxi1/wwVzgQm5e4h08XnQCsJmZfZ+2jzazU5qtVfMwSQAPl0973x84zHwa+3XxcUTzjLCAsDCCBiDpWjzL5ms8A2pLPG1wqfQ/gy+qMz2Vz80A2wWftrtRwiKdcx/csjkFTxPdCnjLzI5N+9fEA+0T5fN2LW9mRze23paOfGbU4/DpJa5s5uYE8xhhYQQN4SR8mu4ZuF/8FzzItwHVUx9USTrWzL7MuZ9SRk45hEUVbsk8hGdVbYpP1pabSLAnPhuwJD2Kr9Z3c2PrbQ2Y2YQ01qVzc7dlfqQ1pmbXh7AwggYhqQe+nvUC+PQZn6VMnV+Z2UWSVsTXT17AzP5cgfpXxQVXR6rX9lgSWNnMTk1xi3H4gK1tgL+3hkF4QdCSiSypoKF0xKeFvhefggJ8srVR6fMx+CptZVm3ohDzGWuzZnZAikt0xGMSXyRhtQ0+wd8vZvZoCIsgaDxhYQSNRr7280/4FNITgKPwbKgTrUILAuXV3Q6PX/TFU0gfxWe+nYTPk1TJ1QuDYL4iBEbQYPJSa9fGx1rciruF7rEyLW9ZQhtWAu7B00n/jE+N8Tsze6op6g+C+YkQGEGjSdNv9DeziyV1TC6ipqq7C54pdQ0e6J1oFVqQKAjmdyJLKigHs/BxGDSlsEhMwQdK/WBm3zVx3UEwXxEWRtBo5vVUwiAInBAYQRAEQUlEWm0QBEFQEiEwgqBCSCxRy74OEidIdGnE+ftIrNvQ44OgvoTACII8JAZI1JplJbG1xI8SI/L+PpJ4Ja/MUsCHEoNqOM1R+MDGknzCEgtLbClxlcS1afN+wKWlHB8E5SCy
pIJgTqbiEyjWxkx8WdMX8rYtDSwEbj3gmVvnmjG08GCJJYGzcGHxvjTX+f9kxh2pbBU+C7DwyR0vB16WED4dyu2lXlgQNJYIegfzNRI34stp5pYqrcJX4Mu3MhYEdjTj6XTMVsBf8Fl5c6wObIivl/0g8K4ZfyhSXydc0DwBXGBWbWFInI1PaTLAjJl52zuYMVViNLCRGWMkOgPf4gIuxwLAR2asU9/7EASlEBZGML8zFXjQjP0BJNYCnjGjd66AxP/h047kmIJbB2fhizhNT39v4dOUvGvGHyR6AL8yc1eVRE9cmHwDjAUektjXjF8kBuLrWKxXICwESJrtPpZEe2BX4BUzts8reyMQY1GCihECI5jfKXWuq5nggWagF3B62n4rvjTs27h1MiMJC+HuogkwO7axDr4y4P749Oy/BZ6X+DM+ieMRZnxcUO/KuEXSCV/B7f/wSRZ/Ai5IguRzYIVU9rYSrycI6k24pIL5GonL8eDzD2lTe1wgfJlXbAlgSzOGSmwA7I1bHIUvT1vgRzOyEn/CV//bwmwO6yS/7s7Am/haIkea1bxmR7Iefouv2/0lsBgwLlknM4G2+e6tIKgEYWEE8zsGPFrEJdUnVyC5pLyw8UYKat8K/Jx3nirgP8B+EufhqwBuVIuwWAWfLHEhYGszXqypgckFtQXwPzyYfjw+HctpwInpGhaUONCMy+tx7UFQLyKtNpjfmTtHqTj5ylUH4Esz1sj9AYNxN9N2+JK1m5kxTuJAiY4AEl0kdpN4Al95cD18tcKbJUanvzESlhezAJ+J91U8MH8jbhEdCCySV2Y6cJLEnvW8/iAombAwgvmdNsDuEtuk71VAd2m2iypHvmCZAiwi8R4+nfvPuPY/wozHJZ40Y6bErngH/7HEx/gEjR8AL+GZUH0KG5NiJB+ZMSt9XxQPrm8JbAw8A/wN+AzPyALAjMkSZwIXSTxak2UTBI0hBEYwv9MWuM+Mg6B4llQOia5m/GjG6ylG8Si+Fsdw4GvgWYAkLHbDg94HmjE8nWKVdJ4+wB9raVN+LKIncJ0ZH+SN1zgS+CrvvDnuxoPxGzPnGJEgKAshMIL5GjMGF2yaCXSV6GXG2NxGieWBdyU2wsdK7A6zB+UZ3knvJHEgcCaeIrubmQuRIvSQGFNkexV51owZ7wPvp685N9XdwEtpTMdCqf6coFrXbI7YShCUjRAYQTAnHwMjgFES+cu7CncHjQf6AZsAh+CuogeBq9PnEan8+maz1zcvpCMw3owlC3ck6+Mziar88RiJbsACZowBxkpsDzyCu6gACGERVJJIqw2CBiKxOdAOeNYMSwPz9gGuMWNGLce1wTv+KY2svw2eThvxiqBJCIERBEEQlESk1QZBEAQlEQIjCIIgKIkQGEEQBEFJhMAIgiAISiIERhAEQVASITCCIAiCkvh/Qd1hRnYWIgwAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib\n",
    "matplotlib.rcParams['font.sans-serif'] = ['SimHei']\n",
    "matplotlib.rcParams['font.family']='sans-serif'\n",
    "\n",
    "# 柱形的宽度\n",
    "bar_width = 0.6\n",
    "plt.xticks(rotation=35)\n",
    "x1 = [x['mv_name'] for x in temp1]\n",
    "\n",
    "y1 = [x['value']['rate'] for x in temp1]\n",
    "# 绘制柱形图\n",
    "plt.bar(x=x1, \n",
    "        height=y1, \n",
    "        width=bar_width, \n",
    "        color=['skyblue', 'pink'],\n",
    "        linewidth=1.5,\n",
    "    )\n",
    "\n",
    "# 26 # 为每个条形图添加数值标签\n",
    "for x,y in enumerate(y1):\n",
    "    plt.text(x,y+0.003,'%.3f' % y,ha='center')\n",
    "plt.xlabel('电影名称',fontsize=14, color='blue')\n",
    "plt.ylabel('评论分歧(反对)平均占比',fontsize=14, color='red')\n",
    "plt.title('豆瓣Top10影视评论分歧占比统计图',fontsize=15, color='green')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**分析过程：**\n",
    "\n",
    "通过PyMongoDB的MapReduce过程，最终得出的豆瓣Top10部分影评分歧占比统计图如上图所示。\n",
    "\n",
    "1. 从整体来看，Top10影视的评论分歧都相对较低，处于`[6.1%, 11.1%]` 范围。\n",
    "\n",
    "2. 其中占比最多电影的为《**盗梦空间**》，分歧率为 `11.1%` ，这意味着有100人评论，那么就有将近11人的观点不被赞同，这跟电影的题材、类型、剧情、演员等多个因素都有关。\n",
    "\n",
    "3. 占比最少的为《**千与千寻**》，仅为 `6.1%`，这说明观众们的观点大多是一致的，100人里面只有6人左右的观点不一致。\n",
    "\n",
    "综上，对于Top10的电影，除了评分、观看数等指标，评论分歧率直观体现了影视的影响力，这意味着观众可以选择这个分歧率较小的电影作为参考，达到更好的观看体验，同时对于同行，能更放心地借鉴其中的一些高深的拍摄手法、剧情演绎方法等。\n"
   ]
  },
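  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The disagreement rate analysed above can be cross-checked outside MongoDB. The following is a minimal sketch in plain Python, assuming that a review's rate is `disagree / (agree + disagree)` and that a movie's value is the mean over its reviews. That formula, the sample documents and the name `mean_disagree_rate` are illustrative assumptions, not taken from the MapReduce job itself.\n",
    "\n",
    "```python\n",
    "from collections import defaultdict\n",
    "\n",
    "# Hypothetical sample reviews; the counts are strings, as in the stored documents.\n",
    "reviews = [\n",
    "    {'rv_mv_id': '1292052', 'rv_action_agree': '90', 'rv_action_disagree': '10'},\n",
    "    {'rv_mv_id': '1292052', 'rv_action_agree': '30', 'rv_action_disagree': '10'},\n",
    "    {'rv_mv_id': '1291561', 'rv_action_agree': '19', 'rv_action_disagree': '1'},\n",
    "]\n",
    "\n",
    "def mean_disagree_rate(docs):\n",
    "    # Collect one rate per review, grouped by movie ID (assumed formula).\n",
    "    rates = defaultdict(list)\n",
    "    for doc in docs:\n",
    "        agree = int(doc['rv_action_agree'])\n",
    "        disagree = int(doc['rv_action_disagree'])\n",
    "        total = agree + disagree\n",
    "        if total:  # skip reviews that received no votes\n",
    "            rates[doc['rv_mv_id']].append(disagree / total)\n",
    "    # Average the per-review rates for each movie.\n",
    "    return {mv_id: sum(r) / len(r) for mv_id, r in rates.items()}\n",
    "\n",
    "print(mean_disagree_rate(reviews))\n",
    "```\n",
    "\n",
    "Converting the counts with `int()` matters because the collection stores them as strings; dividing the raw values would raise a `TypeError`."
   ]
  },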
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.3 统计Top10电影在2022年的评论情况"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'_id': '12920520',\n",
       " 'rv_name': '大头绿豆',\n",
       " 'rv_time': '2005-05-12 20:44:13',\n",
       " 'rv_info': '距离斯蒂芬·金（Stephen King）和德拉邦特（Frank Darabont）们缔造这部伟大的作品已经有十年了。我知道美好的东西想必大家都能感受，但是很抱歉，我的聒噪仍将一如既往。 在我眼里，肖申克的救赎与信念、自由和友谊有关。 ［1］信 念 瑞德（Red）说，希望是危险的东西，是精...',\n",
       " 'rv_mv_id': '1292052',\n",
       " 'rv_action_agree': '17526',\n",
       " 'rv_action_disagree': '708'}"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ct_mv_review.find_one()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'_id': '1292722', 'counter': 342}\n",
      "{'_id': '1291546', 'counter': 327}\n",
      "{'_id': '1295644', 'counter': 359}\n",
      "{'_id': '1295124', 'counter': 359}\n",
      "{'_id': '3011091', 'counter': 344}\n",
      "{'_id': '1292052', 'counter': 312}\n",
      "{'_id': '3541415', 'counter': 343}\n",
      "{'_id': '1292720', 'counter': 358}\n",
      "{'_id': '1292063', 'counter': 360}\n",
      "{'_id': '1291561', 'counter': 328}\n"
     ]
    }
   ],
   "source": [
    "for x in ct_mv_review.aggregate([{'$group': {'_id':'$rv_mv_id', 'counter':{'$sum':1}}}]):\n",
    "    print(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'mv'), 'dc_mv_review')"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Building prefix dict from the default dictionary ...\n",
      "Loading model from cache C:\\Users\\uni10\\AppData\\Local\\Temp\\jieba.cache\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] >> 已统计完 [2] 个电影在2021年1月1后的影评\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Loading model cost 1.042 seconds.\n",
      "Prefix dict has been built successfully.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] >> 电影[美丽人生] 评论的词频统计词云生成完毕, 保存位置在[./count_images/美丽人生.jpg]\n",
      "[INFO] >> 电影[辛德勒的名单] 评论的词频统计词云生成完毕, 保存位置在[./count_images/辛德勒的名单.jpg]\n"
     ]
    }
   ],
   "source": [
    "from datetime import datetime\n",
    "\n",
    "list_rv= []\n",
    "year, month, day = 2021,1,1\n",
    "for x in ct_mv_review.aggregate([\n",
    "    {\n",
    "        # 转化类型\n",
    "        '$project':\n",
    "        {\n",
    "            'rv_time': '$rv_time',\n",
    "            'rv_info': '$rv_info',\n",
    "            'rv_name': '$rv_name',\n",
    "            'rv_time_stand': \n",
    "            {\n",
    "                '$convert':\n",
    "                {\n",
    "                    'input':'$rv_time',\n",
    "                    'to': 'date',\n",
    "                    'onNull': 'missing rv_time'\n",
    "                }\n",
    "            },\n",
    "        },\n",
    "\n",
    "\n",
    "    },\n",
    "    {\n",
    "        '$match': \n",
    "        {\n",
    "            'rv_time_stand':\n",
    "            {\n",
    "                '$gte': datetime(year,month,day)\n",
    "            },\n",
    "        }\n",
    "    }, \n",
    "\n",
    "]):\n",
    "    list_rv.append(x)\n",
    "\n",
    "dict_rv_info = {}\n",
    "'''\n",
    "    处理 MongoDB 聚合后的结果 汇总评论\n",
    "'''\n",
    "for rv in list_rv:\n",
    "    # 前 7 位是电影的ID\n",
    "    mv_id = rv['_id'][:7]\n",
    "    dict_rv_info[dict_rv_name[mv_id]] = {}\n",
    "    dict_rv_info[dict_rv_name[mv_id]][rv['rv_name']] = {\n",
    "            'rv_time': rv['rv_time'],\n",
    "            'rv_info': rv['rv_info']\n",
    "    }\n",
    "print(f'[INFO] >> 已统计完 [{len(dict_rv_info)}] 个电影在{year}年{month}月{day}后的影评')\n",
    "\n",
    "\n",
    "'''\n",
    "    词频统计\n",
    "'''\n",
    "import jieba\n",
    "from wordcloud import WordCloud\n",
    "# 不需要统计的词汇\n",
    "nope = ['电影', '没有', '一个', '之后', '这部']\n",
    "for k, v in dict_rv_info.items():\n",
    "    dict_word_count = {}\n",
    "    # 遍历每个用户的评论\n",
    "    for review in v.values():\n",
    "        # 遍历每个词\n",
    "        for x in jieba.cut(review['rv_info']):\n",
    "            if(len(x) >= 2) and x not in nope:\n",
    "                dict_word_count.setdefault(x, 0)\n",
    "                dict_word_count[x] = dict_word_count[x] + 1\n",
    "    #生成词云 保存到本地\n",
    "    t = WordCloud(\n",
    "        width=600, height=480,  # 图片大小\n",
    "        background_color='white',  # 背景颜色\n",
    "        scale=10,\n",
    "        font_path=r'c:\\windows\\fonts\\simfang.ttf' ).generate_from_frequencies(dict_word_count)\n",
    "    save_path = './count_images/' + k + '.jpg'\n",
    "    t.to_file(save_path)\n",
    "    print(f'[INFO] >> 电影[{k}] 评论的词频统计词云生成完毕, 保存位置在[{save_path}]')\n",
    "# print(dict_word_count)\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "评论词云的结果通过 img 标签展示，如下图所示："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1>1.美丽人生</h1>\n",
    "<img src='./count_images/美丽人生.jpg' width=500px height=400px>\n",
    "\n",
    "<h1>2.辛德勒的名单</h1>\n",
    "<img src='./count_images/辛德勒的名单.jpg' width=500px height=400px>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里展示了两部电影的评论词云，而且是在21年1月份以后的评论，在MongoDB的强大支持下，检索某个日期里的文档数据十分遍历，通过这样的方式，我们能感受到电影从去年到现在的影响力。"
   ]
  },
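  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The date filter used for these word clouds can be mirrored in plain Python, which is handy for checking a few documents by hand. This is a minimal sketch, assuming every `rv_time` follows the `YYYY-MM-DD HH:MM:SS` layout seen in the collection. The sample documents and the name `reviews_since` are hypothetical.\n",
    "\n",
    "```python\n",
    "from datetime import datetime\n",
    "\n",
    "# Hypothetical sample documents using the same rv_time string layout.\n",
    "reviews = [\n",
    "    {'rv_name': 'a', 'rv_time': '2005-05-12 20:44:13'},\n",
    "    {'rv_name': 'b', 'rv_time': '2021-03-01 08:00:00'},\n",
    "]\n",
    "cutoff = datetime(2021, 1, 1)\n",
    "\n",
    "def reviews_since(docs, since):\n",
    "    # Keep documents whose parsed rv_time is on or after the cutoff.\n",
    "    kept = []\n",
    "    for doc in docs:\n",
    "        posted = datetime.strptime(doc['rv_time'], '%Y-%m-%d %H:%M:%S')\n",
    "        if posted >= since:\n",
    "            kept.append(doc)\n",
    "    return kept\n",
    "\n",
    "print([d['rv_name'] for d in reviews_since(reviews, cutoff)])\n",
    "```\n",
    "\n",
    "Filtering inside the aggregation pipeline remains preferable for the full collection, since only the matching documents are sent back to the client."
   ]
  },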
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "cdb7691abdf6f67c8b9253c9b9b05b264d14066a03d4e3c8a4ee6f696976e01e"
  },
  "kernelspec": {
   "display_name": "Python 3.9.7 ('base')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
